Scala: The Static Language that Feels Dynamic

 
by Bruce Eckel
June 12, 2011
Summary
The highest compliment you can deliver in the Python world is to say that something is "Pythonic" -- that it feels and fits into the Python way of thinking. I never imagined that a static language could feel this way, but Scala does -- and possibly even better.

I'm actually glad I waited this long before beginning to learn the language, because they've sorted out a lot of issues in the meantime. In fact, several versions of the language have made breaking changes with previous versions, requiring code rewrites. Some people have found this shocking; an indication that the language is "immature" and "not ready for the enterprise." I find it one of the most promising things about Scala -- it is not determined to become an instant boat anchor by committing to early decisions that are later revealed to be suboptimal, or outright mistakes. Java is the perfect case study, unable to pry its cold, dead fingers from old decisions made badly in a rush to meet an imagined deadline imposed by the Internet. C++ was admirable when it determined to be C-compatible because it brought legions of C programmers into the world of object-oriented programming, but coping with the resulting hurdles is no longer a good use of programmer time.

Indeed, I grew tired of the whole mindset that language design is more important than programmer time; that a programmer should work for the language rather than the reverse. So much so that I thought I had grown out of programming altogether. But now I think I might just have been tired of the old generation of languages and waiting for the next generation -- and especially the forward-thinking around those languages.

If you've read my past writings, you know I am unimpressed with arguments about static type checking for its own sake, which typically come down to "if I can't know X is an int, then the world will collapse!" I've written and seen enough robust code in Python to be unswayed by such histrionics; the payoff for all the hoop-jumping in C++ and Java seems small compared to what can be accomplished using far less, and much clearer, Python code.

Scala is the first language I've seen where static type-checking seems to pay off. Some of its amazing contortional abilities would not, I think, be possible without static type checking. And, as I shall attempt to show in this article, the static checking is relatively unobtrusive -- so much so that programming in Scala almost feels like programming in a dynamic language like Python.

It's Not "Just About Finger Typing"

One retort I've gotten a lot when I discuss the shortcomings of Java compared with a language like Python is "oh, you're just complaining about finger typing" (as opposed to the "typing" of type-checking).

You can trivialize "finger typing" but in my experience it really does make a big difference when you can take an idea and express it in a few keystrokes versus the veritable vomiting of code necessary to express even the simplest concepts in Java. The real problem is not the number of keystrokes, but the mental load. By the time you've jumped through all those hoops, you've forgotten what you were actually trying to do. Often, the ceremony involved in doing something will dissuade you from trying it.

Scala removes as much of the overhead (and mental load) as possible, so you can express higher-order concepts as quickly as you can type them. I was amazed to discover that in many cases, Scala is even more succinct than Python.

The result of all this is something I've always loved about Python: the level of abstraction is such that you can typically express an idea in code more easily and clearly than you can by making diagrams on a whiteboard. There's no need for that intermediate step.

Let's look at an example. Suppose you'd like to model buildings. We can say:

class Building
val b = new Building

Note the absolute minimum amount of ceremony to create a class -- great when you're just sketching out a solution. If you don't need parens, you don't write them. A val is immutable, which is preferred in Scala because it makes concurrent code easier to write (there is also var for variables). And notice that I didn't have to put any type information on b, because Scala has type inference so if it can figure out the type for you, it will. No more jumping through hoops to satisfy a lazy language.

If we want the Building to know how many square feet it contains, there's an explicit way:

class Building(feet: Int) {
  val squareFeet = feet
}
val b = new Building(100)
println(b.squareFeet)

When you do need to provide type information, you just give it after a colon. Note that println() does not require Java's System.out scoping. And class fields default to public -- which is not a big deal if you can stick to val, since that makes it read-only. You can always make them private if you want, and Scala has more fine-grained access control than any language I've seen.

If all you want to do is store the argument in the class, as above, Scala makes it easy. Note the addition of the val in the argument list:

class Building(val feet: Int)
val b = new Building(100)
println(b.feet)

Now feet automatically becomes the field. But it doesn't stop there. Scala has the case class which does even more for you. For one thing, arguments automatically become fields, without saying val before them:

case class Building(feet: Int)
val b = Building(100)
println(b) // Result: Building(100)

Note that new is no longer necessary to create an object, the same form that Python uses. And case classes rewrite toString for you, to produce nice output.

But wait, there's more! A case class automatically gets an appropriate hashCode and == so you can use it as a key in a Map (the -> separates keys from values):

val m = Map(Building(5000) -> "Big", Building(900) -> "Small", Building(2500) -> "Medium")
m(Building(900)) // Result: Small

Note that Map is available (along with List, Vector, Set, println() and more) as part of the "basic Scala building set" that comes without any imports. Again, this feels like Python.
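
To see why a case class works as a Map key, here is a small sketch that compares it with a plain class. PlainBuilding and the value names are hypothetical, introduced only for the contrast:

```scala
// Sketch: structural equality of a case class versus a plain class.
// PlainBuilding is a hypothetical name used only for this comparison.
case class Building(feet: Int)
class PlainBuilding(val feet: Int)

val a = Building(900)
val b = Building(900)
println(a == b)                     // true: compared by field values
println(a.hashCode == b.hashCode)   // true: equal objects hash alike

val p = new PlainBuilding(900)
val q = new PlainBuilding(900)
println(p == q)                     // false: compared by identity

val sizes = Map(Building(900) -> "Small")
println(sizes.get(Building(900)))   // Some(Small)
println(sizes.get(Building(901)))   // None
```

A lookup succeeds with a freshly created Building(900) because the generated == and hashCode depend only on the constructor arguments.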

Inheritance is also succinct. Suppose we want to subclass Building to make a House class:

class House(feet: Int) extends Building(feet)
val h = new House(100)
println(h.feet) // Result: 100

Although the extends keyword is familiar from Java, notice how the base-class constructor is called -- a pretty obvious way to do it, once you've seen it. And again, you don't write any more code than what is absolutely necessary to describe your system.

We can also mix in behavior using traits. A trait is much like an interface, except that traits can contain method definitions, which can then be combined when creating a class. Here are several traits to help describe a house:

trait Bathroom
trait Kitchen
trait Bedroom {
  def occupants() = { 1 }
}
class House(feet: Int) extends Building(feet) with Bathroom with Kitchen with Bedroom
var h = new House(100)
val o = h.occupants()
val feet = h.feet

occupants() is a typical Scala method definition: the keyword def followed by the method name, argument list, and then an = and the body of the method in curly braces. The last line in the method produces the return value. More type inference is happening here; if we wanted to be more specific we could specify the return type of the method:

def occupants(): Int = { 1 }

Notice that the method occupants() is now part of House, via the mixin effect of traits.
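
Traits can also be mixed into a single object at the point where it is created, without declaring a named subclass. The following is a minimal sketch of that idea; the appliances() method and the name studio are assumptions made for illustration:

```scala
// Sketch: mixing traits into one object at creation time.
// appliances() and `studio` are illustrative assumptions.
class Building(val feet: Int)
trait Kitchen {
  def appliances() = List("stove", "fridge")
}
trait Bedroom {
  def occupants() = 1
}

// No named subclass is needed; the traits are mixed in right here.
val studio = new Building(400) with Kitchen with Bedroom
println(studio.feet)          // 400
println(studio.occupants())   // 1
println(studio.appliances())  // List(stove, fridge)
```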

Consider how simple this code is ... and how undistracting. You can talk about what it's doing, rather than explaining meaningless syntactic requirements as you must do in Java. Creating a model takes no more than a few lines of straightforward code. Wouldn't you rather teach this to a novice programmer than Java?

Functional Programming

Functional programming is often promoted first as a way to do concurrency. However, I've found it to be more fundamentally useful as a way to decompose programming problems. Indeed, C++ has had functional programming virtually from inception, in the form of the STL, without built-in support for concurrency. Python also has significant functional programming libraries but these are independent of its thread support (which, since Python cannot support true parallelism, is primarily for code organization).

Scala has the best of both worlds: true multiprocessor parallelism and a powerful functional programming model -- but one that does not force you to program functionally if it's not appropriate.

When approaching a functional style of programming, I think it's important to go slow and be gentle with yourself. If you push too hard you can get caught up in knots. In fact, I think one of the great benefits of learning functional programming is that it disciplines you to break a problem into small, provable steps -- and to use existing (and proven) code for each of those steps whenever possible. This not only makes your non-functional code better, but it also tends to make everything you write more testable, since functional programming focuses on transforming data (thus, after each transformation, you have something else to test).

Much of functional programming involves performing operations on collections. If, for example, we have a Vector of data:

val v = Vector(1.1, 2.2, 3.3, 4.4)

You can certainly print this using a for loop:

for(n <- v) {
  println(n)
}

The left-arrow can be pronounced "in" -- n gets each value in v. This syntax is definitely a step up from having to give every detail as you had to do in C++ and Java (note that Scala does all the creation and type-inference for n). But with functional programming, you extract the looping structures altogether. Scala collections and iterables have a large selection of operations to do this for you. One of the simplest is foreach, which performs an operation on each element in the collection. So the above code becomes:

v.foreach(println)

This actually uses several shortcuts, and to take full advantage of functional programming you first need to understand the anonymous function -- a function without a name. Here's the basic form:

(function parameters) => function body

The => is often pronounced "rocket," and it means, "Take the parameters on the left and apply them in the code on the right." An anonymous function can be large; if you have multiple lines, just put the body inside curly braces.

Here's a simple example of an anonymous function:

(x:Int, y:Double) => x * y
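
Because an anonymous function is an ordinary value, it can be stored in a val and called later. This sketch shows both the one-line form and the multi-line form with curly braces; the names multiply and describe are assumptions made for the example:

```scala
// Sketch: an anonymous function is a value, so it can be stored and called.
// The names `multiply` and `describe` are assumptions made for this example.
val multiply = (x: Int, y: Double) => x * y
println(multiply(3, 1.5))   // 4.5

// A multi-line body goes inside curly braces; the last line is the result.
val describe = (n: Double) => {
  val doubled = n * 2
  "double of " + n + " is " + doubled
}
println(describe(2.5))      // double of 2.5 is 5.0
```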

The previous foreach call is, stated explicitly:

v.foreach((n:Double) => println(n))

Usually, you can rely on Scala to do type inference on the argument -- in this case Scala can see that v contains Double so it can infer that n is a Double:

v.foreach((n) => println(n))

If you only have a single argument, you can omit the parentheses:

v.foreach(n => println(n))

When you have a single argument, you can leave out the parameter list altogether and use an underscore in the anonymous function body:

v.foreach(println(_))

And finally, if the function body is just a call to a single function that takes one parameter, you can eliminate the parameter list, which brings us back to:

v.foreach(println)

With all these options and the density possible in functional programming, it's easy to succumb to fits of cleverness and end up writing obtuse code that will cause people to reject the language as too complex. But with some effort and focus on readability this doesn't need to happen.

foreach relies on side effects and doesn't return anything. In more typical functional programming you'll perform operations (usually on a collection) and return the result, then perform operations on that result and return something else, etc. One of the most useful functional tools is map, rather unfortunately named because it's easy to confuse with the Map data structure. map performs an operation on each element in a sequence, just like foreach, but map creates and returns a new sequence from the result. For example:

v.map(n => n * 2)

multiplies each element in v by 2 and returns the result, producing:

Vector(2.2, 4.4, 6.6, 8.8)

Again, using shortcuts we can reduce the call to:

v.map(_ * 2)

There are a number of operations that are simple enough to be called without parameters, such as:

v.reverse
v.sum
v.sorted
v.min
v.max
v.size
v.isEmpty

Operations like reverse and sorted return a new Vector and leave the original untouched.
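
As a quick check of that claim, here is a sketch using a deliberately unsorted Vector (the values are chosen for illustration and differ from the v used elsewhere in this article):

```scala
// Sketch: parameterless operations return new values and never modify v.
val v = Vector(4.4, 1.1, 3.3, 2.2)
println(v.reverse)  // Vector(2.2, 3.3, 1.1, 4.4)
println(v.sorted)   // Vector(1.1, 2.2, 3.3, 4.4)
println(v.min)      // 1.1
println(v.max)      // 4.4
println(v.size)     // 4
println(v.isEmpty)  // false
println(v)          // Vector(4.4, 1.1, 3.3, 2.2) -- the original is unchanged
```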

It's common to see operations chained together. For example, permutations produces an iterator that selects all the different permutations of v. To display these, we pass the iterator to foreach:

v.permutations.foreach(println)

Another helpful function is zip, which takes two sequences and puts each adjacent element together, like a zipper. This:

Vector(1,2,3).zip(Vector(4,5,6))

produces:

Vector((1,4), (2,5), (3,6))

(Yes, the parenthesized groups within the Vector are tuples, just like in Python).

We can get fancy, and zip the elements of v together with those elements multiplied by 2:

v.zip(v.map(_ * 2))

which produces:

Vector((1.1,2.2), (2.2,4.4), (3.3,6.6), (4.4,8.8))

It's important to know that anonymous functions are a convenience, and very commonly used, but they are not essential for doing functional programming. If anonymous functions are making your code too complicated, you can always define a named function and pass that. For example:

def timesTwo(d: Double) = d * 2

(This uses another Scala shortcut: if the function body fits on one line, you don't need curly braces). This can be used instead of the anonymous function:

v.zip(v.map(timesTwo))

You know you could produce the same effect as the code in this section using for loops. One of the biggest benefits of functional programming is that it takes care of the fiddly code -- the very code that seems to involve the kind of common errors that easily escape our notice. You're able to use the functional pieces as reliable building blocks, and create robust code more quickly. It certainly is easy for functional code to rapidly devolve into unreadability, but with some effort you can keep it clear.

For me, one of the best things about functional programming is the mental discipline that it produces. I find it helps me learn to break problems down into small, testable pieces, and clarifies my analysis. For that reason alone, it is a worthwhile practice.

Pattern Matching

It's amazing how long programmers have put up with stone-age (or more appropriately, assembly-age) language constructs. The switch statement is an excellent example. Seriously, jumping around based on an integral value? How much effort does that really save me? People have begged for things as simple as switching on strings, but this is usually met with "no" from the language designers.

Scala leapfrogs all that with the match statement, which looks much like a switch statement except that it can select on just about anything. The clarity and code savings are huge:

// PatternMatching.scala (Run as script: scala PatternMatching.scala)
trait Color
case class Red(saturation: Int) extends Color
case class Green(saturation: Int) extends Color
case class Blue(saturation: Int) extends Color

def matcher(arg:Any): String = arg match {
  case "Chowder" => "Make with clams"
  case x: Int => "An Int with value " + x
  case Red(100) => "Red sat 100"
  case Green(s) => "Green sat " + s
  case c: Color => "Some Color: " + c
  case w: Any => "Whatever: " + w
  case _ => "Default, but Any captures all"
}

val v = Vector(1, "Chowder", Red(100), Green(50), Blue(0), 3.14)
v.foreach(x => println(matcher(x)))

A case class is especially useful because the pattern matcher can decompose it, as you'll see.

Any is the root class of all objects including what would be "primitive" types in Java. Since matcher() takes an Any we can be confident that it will handle any type that we pass in.

Ordinarily you'd see an opening curly brace right after the = sign, to surround the entire function body in curly braces. In this case, the function body is a single statement so I can take a shortcut and leave off the outer braces.

A pattern-matching statement starts with the object you want to match against (this can be a tuple), the match keyword and a body consisting of a sequence of case statements. Each case begins with the match pattern, then a rocket and one or more lines of code which execute upon matching. The last line in each case produces a return value.

Match expressions can take many forms, only a few of which are shown here. First, you see a simple string match; however Scala has sophisticated regular expression syntax and you can use regular expressions as match expressions, including picking out the pieces into variables. You can capture the result of a match into a variable as in case x: Int. Case classes can produce an exact match as in Red(100) or you can pick out the constructor arguments as in Green(s). You can also match against traits, as in c: Color.

You have two choices if you want to catch everything else. To capture into a variable, you can match Any, as in case w: Any. If you don't care what the value is, you can just say case _.

Note that no "break" statement is necessary at the end of each case body.
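
The regular-expression form mentioned above is not shown in the example, so here is a minimal sketch of it. The Dimensions pattern and the describe function are assumptions made for illustration; the .r method turns a string into a Regex, and the groups in parentheses are captured as Strings:

```scala
// Sketch: a regular expression used as a match pattern.
// The `Dimensions` pattern and `describe` function are illustrative assumptions.
val Dimensions = """(\d+)x(\d+)""".r

def describe(text: String): String = text match {
  case Dimensions(w, h) => "Width " + w + ", height " + h  // groups captured as Strings
  case other => "No dimensions in: " + other
}

println(describe("30x40"))   // Width 30, height 40
println(describe("hello"))   // No dimensions in: hello
```

The pattern only matches when the whole string fits the expression, so "hello" falls through to the second case.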

Concurrency with Actors

Most of what drove me away from programming were things I had figured out but couldn't convincingly express to others. Things that the Ph.D. computer scientists ought to be proving. Such as:

  • Beyond a certain level of program complexity, you must have a garbage collector. This could be as simple as any program where objects can belong to more than one collection, but at some point I believe it becomes impossible to manage memory yourself. (C++ people didn't buy this one, although C++0X has hooks now for garbage collection).
  • Checked exceptions are a failed experiment. For small programs they seem like a good idea, but they don't scale up well.
  • Shared-memory concurrency is impossible to get right. In theory the smartest programmer in the world could play whack-a-mole long enough to chase down and patch all the race conditions. But then, all you have to do is change the program a little and everything comes back. Shared-memory is just the wrong model for concurrency.

Note that all these are issues of scale -- things that work in the small start falling apart as programs get bigger or more complex. That's probably why they're hard to argue about, because demonstration examples can be small and obvious.

It turns out I was arguing with the wrong people. Or rather, the right people were not arguing about it, they were off fixing the problems. When it comes to concurrency, the right answer is one that you can't screw up: you live behind a safe wall, and messages get safely passed back and forth over the wall. You don't have to think about whether something is going to lock up (not on a low level, anyway); you live in your little walled garden which happens to run with its own thread.

The most object-ish approach to this that I've seen is actors. An actor is an object that has an incoming message queue, often referred to as a "mailbox." When someone outside your walled garden wants you to do something, they send you a message that safely appears in your mailbox, and you decide how to handle that message. You can send messages to other actors through their mailboxes. As long as you keep everything within your walls and only communicate through messages, you're safe.
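
To make the mailbox idea concrete, here is a rough sketch built only on the JDK's LinkedBlockingQueue and one thread per object. It is not the scala.actors library or its API; MiniBunny, its hops counter, and join() are hypothetical names invented for this illustration:

```scala
// Sketch of the mailbox idea using plain JDK classes (NOT scala.actors).
// MiniBunny, hops and join() are hypothetical names for illustration.
import java.util.concurrent.LinkedBlockingQueue

sealed trait Message
case object Hop extends Message
case object Stop extends Message

class MiniBunny(id: Int) {
  // The mailbox: other threads only ever touch this thread-safe queue.
  private val mailbox = new LinkedBlockingQueue[Message]()
  @volatile var hops = 0   // written only by the worker thread below

  private val worker = new Thread(new Runnable {
    def run(): Unit = {
      var running = true
      while (running) {
        mailbox.take() match {      // blocks until a message arrives
          case Hop  => hops += 1
          case Stop => running = false
        }
      }
    }
  })
  worker.start()

  def !(msg: Message): Unit = mailbox.put(msg)  // send a message
  def join(): Unit = worker.join()              // wait for the worker to finish
}

val bunny = new MiniBunny(1)
bunny ! Hop
bunny ! Hop
bunny ! Stop
bunny.join()
println(bunny.hops)   // 2
```

Messages are handled one at a time in arrival order, so the state inside the object is only ever touched by its own thread.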

To create an actor, you inherit from the Actor class and define an act() method, which is called to handle mailbox messages. Here's the most trivial example I could think of:

// Bunnies.scala (Run as script: scala Bunnies.scala)
case object Hop
case object Stop

case class Bunny(id: Int) extends scala.actors.Actor {
  this ! Hop // Constructor code
  start()    // ditto
  def act() {
    loop {
      react {
        case Hop =>
          print(this + " ")
          this ! Hop
          Thread.sleep(500)
        case Stop =>
          println("Stopping " + this)
          exit()
      }
    }
  }
}

val bunnies = Range(0,10).map(new Bunny(_))
println("Press RETURN to quit")
readLine
bunnies.foreach(_ ! Stop)

The act() method is automatically a match statement, although this is not built into the language -- Scala magic was used to make the Actor library work this way. Because of the match statement, case objects work especially well as messages (although, as with any match statement, you can match on virtually anything) -- a case object is just like a case class except that defining one automatically creates a singleton object.

The loop { react { construct looks a little strange at first; this is an artifact of the evolution of Scala actors. In the initial design, you only had a loop to open the match statement for mailbox messages. But later, in an act of brilliance, it was determined that the concurrency provided by threads could be combined with cooperative multitasking, wherein a single thread of control is passed around -- cooperatively -- among tasks. Each task does something and then explicitly gives up control, which is then passed to the next task. The benefit of cooperative multitasking is that it requires virtually no stack space or context switching time and thus it can scale up -- often to millions of tasks. By combining this with threaded concurrency, you get the best of both worlds: The speed and scalability of cooperative tasks, which are also distributed across as many processors as are available. This all comes transparently. The loop { react { construct should be your default choice, and doesn't cost anything. I suspect if they were creating actors from scratch now, this construct would probably have been simplified into just loop {.

Note the two "naked" lines of code at the beginning of class Bunny. In Scala, you don't have to put object initialization code inside a special method, and you can put it anywhere inside the body of the class. The first line uses the Actor operator ! for sending messages, and in this case the object sends a message to itself, to get things going. Then it calls start() to begin the actor's message loop. When the actor receives a Hop message, it prints itself, sends itself another Hop message, then sleeps for half a second. When it gets a Stop message it calls Actor.exit() to stop the event loop.

To create all the Bunny objects I use Range() to create a sequence from 0 through 9, which is mapped onto calls to Bunny constructors. readLine waits for the user to press a carriage return, at which point a Stop message is sent to each Bunny.

Scala 2.9 includes parallel collections, a powerful way to easily use multiple processors for bulk operations like foreach, map, etc. Suppose you have a collection of data objects called toBeProcessed and an expensive function process. To automatically parallelize the processing, you just add a .par:

val result = toBeProcessed.par.map(obj => process(obj))

If you know that you have objects that can be processed in parallel, this construct makes it effortless. You can find out more about parallel collections in this Scala Days 2010 video.

Even more powerful is the Akka library, which builds concurrent systems that are, among other things, transparently remoteable.

Scala is the best solution for concurrent programming that I've seen, and it keeps getting better.

But Scala is so Complex!

Scala does suffer from the mistaken idea that it's complicated, and for good reason. Many early adopters have been language enthusiasts who love to show how clever they are, and this only confuses beginners. But you can see from the code above that learning Scala should be a lot easier than learning Java! There's none of the horrible Java ceremony necessary just to write "Hello, world!" -- in Scala you can actually create a one-line script that says:

println("Hello, world!")

Or you can run it in Scala's interactive interpreter, which allows you to easily experiment with the language.

Or consider opening a file and processing the contents (something that's also very high-ceremony in Java):

val fileLines = io.Source.fromFile("Colors.scala").getLines.toList
fileLines.foreach(println)

(The "processing" in this case is just printing each line). The simplicity of the code required to open a file and read all the lines, combined with the power of the language, suggests that Scala can be very useful for solving scripting problems (also Scala has strong native support for XML).

We can make some small modifications to create a word-count program:

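for(file <- args) {
  print(file + ": ")
  // Join lines with a space so words at line boundaries don't fuse together
  val contents = io.Source.fromFile(file).getLines.mkString(" ")
  // Split on runs of white space and ignore any empty token
  println(contents.split("\\s+").count(_.nonEmpty))
}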

args is available to all programs and contains all the command-line arguments, so this program steps through them one at a time. Here, we split words at white space but Scala also has regular expressions.
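
As a sketch of the regular-expression alternative, the following pulls words out of an in-memory string instead of a file, so it runs without any input. The sample text and the name Word are assumptions made for this example:

```scala
// Sketch: extracting words with a regular expression instead of splitting.
// The sample text and the name `Word` are assumptions for illustration.
val text = "Hop, hop! Stop. Hop again."
val Word = """\w+""".r

val words = Word.findAllIn(text).toList
println(words)          // List(Hop, hop, Stop, Hop, again)
println(words.length)   // 5

// Splitting on a space also finds 5 tokens, but they keep their punctuation:
println(text.split(" ").toList)   // List(Hop,, hop!, Stop., Hop, again.)
```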

It is possible to write complex code that requires expertise to unravel. But it's totally unnecessary to write such code when teaching beginners. Indeed, if taught right a person should come away from Scala thinking that it is a simpler, more consistent language than the alternatives.

How to Learn

All the Scala tutorials I encountered assume that you are a Java programmer. This is unfortunate because, as I've shown above, Scala could be taught as a first language in a much less-confusing way than we are forced to teach Java. But it does make it easier for writers to assume that you know how to program, and in Java.

  • I found Daniel Spiewak's series of blog posts titled Scala for Java Refugees to be a very helpful starting point.
  • Next, I read Programming in Scala, 2nd Edition by Martin Odersky, Lex Spoon, and Bill Venners. Odersky is the creator of the language so this is the authoritative book to read. It definitely assumes you're a Java programmer and it's not a particularly introductory book but it's worth pushing through to give you a fuller perspective on what Scala can do.
  • The Scala language website has a nice set of free tutorials which I've been finding very useful.
  • I've also looked for different views by reading other books, for example Programming Scala by Venkat Subramaniam. Although this one definitely suffers from too much cleverness (in the first chapter he gives a very obtuse example, then tells you not to worry about it, then says to study and understand it) and should certainly not be your first book, I found the different perspective to be helpful.
  • After reading those, I took the Stairway to Scala workshop by Bill Venners (coauthor of Programming in Scala) and Dick Wall (leader of the Java Posse) in Ann Arbor in May. There's one coming up August 8-12 in San Francisco; you can find out more and register here: http://www.artima.com/shop/stairway_to_scala. This is not an especially introductory class; you should have programming experience -- ideally in Java, because that's what they refer to most -- and I strongly recommend reading Programming in Scala beforehand, and other books if you can manage it -- even if you don't understand it in depth, exposing your brain early will allow some concepts to become more comfortable. It made a huge difference for me to do as much study as I did before the seminar.
  • There are also three different "tents" at the "Scala Campsite" during the Programming Summer Camp.

There are language features that I have only touched on here, or not covered at all. What I've shown should either give you the urge to learn and use Scala, or it will have you running back to the safety of your favorite language.


Bruce Eckel (www.BruceEckel.com) provides development assistance in Python with user interfaces in Flex. He is the author of Thinking in Java (Prentice-Hall, 1998, 2nd Edition, 2000, 3rd Edition, 2003, 4th Edition, 2005), the Hands-On Java Seminar CD ROM (available on the Web site), Thinking in C++ (PH 1995; 2nd edition 2000, Volume 2 with Chuck Allison, 2003), C++ Inside & Out (Osborne/McGraw-Hill 1993), among others. He's given hundreds of presentations throughout the world, published over 150 articles in numerous magazines, was a founding member of the ANSI/ISO C++ committee and speaks regularly at conferences.
