Hadoop and Spark! Now It's Time to Learn Scala!

We've gained a little experience with Hadoop over the previous posts. I will introduce Spark in a later post.

Before looking into Spark, we need to have some idea of the Scala language.

Here is a simple explanation of the Scala language. Today we will explore it a little.

Step 1. Installation and simple guide.

We can download Scala from the official site and install it as follows:

The official Scala site is: https://www.scala-lang.org/

Download the binary distribution and extract it. Then move it to /usr/local/share/scala.

Path and Environment

For quick access, add scala and scalac to your path. For example:

 

Environment   Variable       Value (example)
Unix          $SCALA_HOME    /usr/local/share/scala
              $PATH          $PATH:$SCALA_HOME/bin

Set the environment variables and run some simple code.

Run it interactively! (An interface where you can see results immediately as you code.)

The scala command starts an interactive shell where Scala expressions are interpreted interactively.


  > scala
  This is a Scala shell.
  Type in expressions to have them evaluated.
  Type :help for more information.

  scala> object HelloWorld {
  |   def main(args: Array[String]): Unit = {
  |     println("Hello, world!")
  |   }
  | }
  defined module HelloWorld

  scala> HelloWorld.main(null)
  Hello, world!

  scala> :q
  >

The shortcut :q stands for the internal shell command :quit used to exit the interpreter.

Compile it!

The scalac command compiles one (or more) Scala source file(s) and generates Java bytecode which can be executed on any standard JVM. The Scala compiler works similarly to javac, the Java compiler of the Java SDK.


  > scalac HelloWorld.scala

By default scalac generates the class files into the current working directory. You may specify a different output directory using the -d option.


  > scalac -d classes HelloWorld.scala

Execute it!

The scala command executes the generated bytecode with the appropriate options:


  > scala HelloWorld

scala allows us to specify command options, such as the -classpath (alias -cp) option:


  > scala -cp classes HelloWorld

The argument of the scala command has to be a top-level object. If that object extends trait App, then all statements contained in that object will be executed; otherwise you have to add a method main which will act as the entry point of your program.

Here is what the “Hello, world” example looks like using the App trait:


  object HelloWorld extends App {
    println("Hello, world!")
  }

Here are more details about Scala.

Step 2. Learn to use the Scala interpreter

The easiest way to get started with Scala is by using the Scala interpreter, which is an interactive “shell” for writing Scala expressions and programs. Simply type an expression into the interpreter and it will evaluate the expression and print the resulting value. The interactive shell for Scala is simply called scala. You use it like this:

$ scala
This is an interpreter for Scala.
Type in expressions to have them evaluated.
Type :help for more information.

scala> 

After you type an expression, such as 1 + 2, and hit return:

scala> 1 + 2

The interpreter will print:

unnamed0: Int = 3

This line includes:

  • an automatically assigned or user-defined name to refer to the computed value (unnamed0)
  • a colon (:)
  • the type of the expression and its resulting value (Int)
  • an equals sign (=)
  • the value resulting from evaluating the expression (3)

The type Int names the class Int in the package scala. Values of this class are implemented just like Java's int values. More generally, each of Java's primitive types has a corresponding class in the scala package. For example, Java's boolean corresponds to scala.Boolean, and Java's float corresponds to scala.Float. When you compile your Scala code to Java bytecodes, however, Scala will compile these types to Java's primitive types where possible to get the performance benefits of Java's primitive types.

The unnamedX identifier may be used in later lines. (Recent versions of the interpreter name these results res0, res1, and so on instead.) For instance, since unnamed0 was set to 3 previously, unnamed0 * 3 will be 9:

scala> unnamed0 * 3
unnamed1: Int = 9

To print the necessary, but not sufficient, Hello, world! greeting, type:

scala> println("Hello, world!")
Hello, world!
unnamed2: Unit = ()

The type of the result here is scala.Unit, which is Scala's analogue to void in Java. The main difference between Scala's Unit and Java's void is that Scala lets you write down a value of type Unit, namely (), whereas in Java there is no value of type void. (In other words, just as 1, 2, and 3, are potential values of type int in both Scala and Java, () is the one and only value of type Unit in Scala. By contrast, there are no values of type void in Java.) Except for this, Unit and void are equivalent. In particular, every void-returning method in Java is mapped to a Unit-returning method in Scala.
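The point that Unit has exactly one value can be sketched in a few lines. This is only an illustration; the object UnitDemo and its member names are made up for this example and are not part of the original tutorial.

```scala
// UnitDemo and its members are hypothetical names used only for illustration.
object UnitDemo {
  // A method that exists only for its side effect; its result type is Unit.
  def greet(name: String): Unit = println("Hello, " + name + "!")

  // Unlike Java's void, the result of a Unit method is an ordinary value
  // that can be bound to a val.
  val result: Unit = greet("world")

  // Unit has exactly one value, written (); its string form is "()".
  val rendered: String = result.toString
}
```

Binding the result of a Unit method is rarely useful in practice; the sketch only shows that it is possible, which is the difference from void.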

Step 3. Define some variables

Scala differentiates between vals, variables that are assigned once and never change, and vars, variables that may change over their lifetime. Here's a val definition:

scala> val msg = "Hello, world!"
msg: java.lang.String = Hello, world!

This introduces msg as a name for the value "Hello, world!". The type of the value above is java.lang.String, because Scala strings are also Java strings. (In fact every Java class is available in Scala.)

This example also points out an important and very useful feature of Scala: type inference. Notice that you never said the word java.lang.String or even String in the val definition. The Scala interpreter inferred the val's type to be the type of its initialization assignment. Since msg was initialized to "Hello, world!", and since "Hello, world!" is type java.lang.String, the compiler gave msg the type java.lang.String.

When the Scala interpreter (or compiler) can infer a type, it is usually best to let it do so rather than fill the code with unnecessary, explicit type annotations. You can, however, specify a type explicitly if you wish. (For example, you may wish to explicitly specify the types of public members of classes for documentation purposes.) In contrast to Java, where you specify a variable's type before its name, in Scala you specify a variable's type after its name, separated by a colon. For example:

scala> val msg2: java.lang.String = "Hello again, world!"
msg2: java.lang.String = Hello again, world!

Or, since java.lang types are visible with their simple names in Scala programs, simply:

scala> val msg3: String = "Hello yet again, world!"
msg3: String = Hello yet again, world!

Going back to our original msg, now that it is defined, you can then use the msg value as you'd expect, as in:

scala> println(msg)
Hello, world!
unnamed3: Unit = ()

What you can't do with msg, given that it is a val not a var, is reassign it. For example, see how the interpreter complains when you attempt the following:

scala> msg = "Goodbye cruel world!"
:5 error: assignment to non-variable 
  val unnamed4 = {msg = "Goodbye cruel world!";msg}

If reassignment is what you want, you'll need to use a var, as in:

scala> var greeting = "Hello, world!"
greeting: java.lang.String = Hello, world!

Since greeting is a variable (defined with var) not a value (defined with val), you can reassign it later. If you are feeling grouchy later, for example, you could change your greeting to:

scala> greeting = "Leave me alone, world!"
greeting: java.lang.String = Leave me alone, world!
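Outside the interpreter, the same val and var rules might look like the following sketch. The object name ValVarDemo and the method becomeGrouchy are assumptions introduced here for illustration.

```scala
// ValVarDemo and becomeGrouchy are hypothetical names used only for illustration.
object ValVarDemo {
  // val: bound once. A later "msg = ..." would be rejected by the compiler.
  val msg = "Hello, world!"                 // type inferred as String
  val msg2: String = "Hello again, world!"  // explicit type, written after the name

  // var: may be reassigned, but only to another String.
  var greeting = "Hello, world!"

  // Reassigns the var and returns its new value.
  def becomeGrouchy(): String = {
    greeting = "Leave me alone, world!"
    greeting
  }
}
```

A reasonable default is to start with val and switch to var only when reassignment is really needed.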

Step 4. Define some methods

Now that you've worked with Scala variables, you'll probably want to write some methods. Here's how you do that in Scala:

scala> def max(x: Int, y: Int): Int = if (x < y) y else x
max: (Int,Int)Int

Method definitions start with def instead of val or var. The method's name, in this case max, is followed by a list of parameters in parentheses. A type annotation must follow every method parameter, preceded by a colon in the Scala way, because the Scala compiler (and interpreter, but from now on we'll just say compiler) does not infer method parameter types. In this example, the method named max takes two parameters, x and y, both of type Int. After the close parenthesis of max's parameter list you'll find another “: Int” type specifier. This one defines the result type of the max method itself.

Sometimes the Scala compiler will require you to specify the result type of a method. If the method is recursive [1], for example, you must explicitly specify the method result type. In the case of max however, you may leave the result type specifier off and the compiler will infer it. Thus, the max method could have been written:

scala> def max2(x: Int, y: Int) = if (x < y) y else x
max2: (Int,Int)Int

Note that you must always explicitly specify a method's parameter types regardless of whether you explicitly specify its result type.

The name, parameter list, and result type, if specified, form a method's signature. After the method's signature you must put an equals sign and then the body of the method. Since max's body consists of just one statement, you need not place it inside curly braces, but you can if you want. So you could also have written:

scala> def max3(x: Int, y: Int) = { if (x < y) y else x }
max3: (Int,Int)Int

If you want to put more than one statement in the body of a method, you must enclose them inside curly braces.

Once you have defined a method, you can call it by name, as in:

scala> max(3, 5)
unnamed6: Int = 5
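Two of the rules above are easier to see side by side: a recursive method needs an explicit result type, and a body with more than one statement needs curly braces. The following is a minimal sketch; MethodDemo, factorial, and maxOfThree are names invented for this example.

```scala
// MethodDemo, factorial and maxOfThree are hypothetical names used only for illustration.
object MethodDemo {
  // Recursive method: the ": Int" result type is required here,
  // because the compiler does not infer result types of recursive methods.
  def factorial(n: Int): Int =
    if (n <= 1) 1 else n * factorial(n - 1)

  // More than one statement in the body, so curly braces are required.
  // The value of the last expression is the result of the method.
  def maxOfThree(x: Int, y: Int, z: Int): Int = {
    val larger = if (x < y) y else x
    if (larger < z) z else larger
  }
}
```

Calling MethodDemo.factorial(5) evaluates to 120, and MethodDemo.maxOfThree(3, 9, 5) evaluates to 9.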

Note that if a method takes no parameters, as in:

scala> def greet() = println("Hello, world!")
greet: ()Unit

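You can call it with or without parentheses (Scala 3 requires the parentheses for a method declared with an empty parameter list, so prefer the first form):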

scala> greet()
Hello, world!
unnamed7: Unit = ()

scala> greet
Hello, world!
unnamed8: Unit = ()

The recommended style guideline for such method invocations is that if the method may have side effects [4], you should provide the parentheses even if the compiler doesn't require them. Thus in this case, since the greet method prints to the standard output, it has side effects and you should invoke it with parentheses to alert programmers looking at the code.

Step 5. Write some Scala scripts

Although Scala is designed to help developers build large systems, it also scales down nicely such that it feels natural to write scripts in it. A script is just a sequence of statements in a file that will be executed sequentially. (By the way, if you're still running the scala interpreter, you can exit it by entering the :quit command.) Put this into a file named hello.scala:

println("Hello, world, from a script!")

then run:

>scala hello.scala

And you should get yet another greeting:

Hello, world, from a script!

Command line arguments to a Scala script are available via a Scala array named args. In Scala, arrays are zero based, as in Java, but you access an element by specifying an index in parentheses rather than square brackets. So the first element in a Scala array named steps is steps(0), not steps[0]. To try this out, type the following into a new file named helloarg.scala:

// Say hello to the first argument
println("Hello, " + args(0) + "!")

then run:

>scala helloarg.scala planet

In this command, "planet" is passed as a command line argument, which is accessed in the script as args(0). Thus, you should see:

Hello, planet!

Note also that this script included a comment. As with Java, the Scala compiler will ignore characters between // and the next end of line, as well as any characters between /* and */. This example also shows strings being concatenated with the + operator. This works as you'd expect. The expression "Hello, " + "world!" will result in the string "Hello, world!".

By the way, if you're on some flavor of Unix, you can run a Scala script as a shell script by prepending a “pound bang” directive at the top of the file. For example, type the following into a file named helloarg:

#!/bin/sh
exec scala "$0" "$@"
!#
// Say hello to the first argument
println("Hello, " + args(0) + "!")

The initial #!/bin/sh must be the very first line in the file. Once you set its execute permission:

>chmod +x helloarg

You can run the Scala script as a shell script by simply saying:

>./helloarg globe

Which should yield:

Hello, globe!

Step 6. Loop with while, decide with if

You write while loops in Scala in much the same way you do in Java. Try out a while by typing the following into a file named printargs.scala:

var i = 0
while (i < args.length) {
  println(args(i))
  i += 1
}

This script starts with a variable definition, var i = 0. Type inference gives i the type scala.Int, because that is the type of its initial value, 0. The while construct on the next line causes the block (the code between the curly braces) to be repeatedly executed until the boolean expression i < args.length is false. args.length gives the length of the args array, similar to the way you get the length of an array in Java. The block contains two statements, each indented two spaces, the recommended indentation style for Scala. The first statement, println(args(i)), prints out the ith command line argument. The second statement, i += 1, increments i by one. Note that Java's ++i and i++ don't work in Scala. To increment in Scala, you need to say either i = i + 1 or i += 1. Run this script with the following command:

>scala printargs.scala Scala is fun

And you should see:

Scala
is
fun

For even more fun, type the following code into a new file named echoargs.scala:

var i = 0
while (i < args.length) {
  if (i != 0)
    print(" ")
  print(args(i))
  i += 1
}
println()

In this version, you've replaced the println call with a print call, so that all the arguments will be printed out on the same line. To make this readable, you've inserted a single space before each argument except the first via the if (i != 0) construct. Since i != 0 will be false the first time through the while loop, no space will get printed before the initial argument. Lastly, you've added one more println to the end, to get a line return after printing out all the arguments. Your output will be very pretty indeed.

If you run this script with the following command:

>scala echoargs.scala Scala is even more fun

You'll get:

Scala is even more fun

Note that in Scala, as in Java, you must put the boolean expression for a while or an if in parentheses. (In other words, you can't say in Scala things like if i < 10 as you can in a language such as Ruby. You must say if (i < 10) in Scala.) Another similarity to Java is that if a block has only one statement, you can optionally leave off the curly braces, as demonstrated by the if statement in echoargs.scala. And although you haven't seen many of them, Scala does use semi-colons to separate statements as in Java, except that in Scala the semi-colons are very often optional, giving some welcome relief to your right pinky finger. If you had been in a more verbose mood, therefore, you could have written the echoargs.scala script as follows:

var i = 0;
while (i < args.length) {
  if (i != 0) {
    print(" ");
  }
  print(args(i));
  i += 1;
}
println();

If you type the previous code into a new file named echoargsverbosely.scala, and run it with the command:

> scala echoargsverbosely.scala In Scala semicolons are often optional

You should see the output:

In Scala semicolons are often optional

Note that because you had no parameters to pass to the println method, you could have left off the parentheses and the compiler would have been perfectly happy. But given the style guideline that you should always use parentheses when calling methods that may have side effects—coupled with the fact that by printing to the standard output, println will indeed have side effects—you specified the parentheses even in the concise echoargs.scala version.

One of the benefits of Scala that you can begin to see with these examples, is that Scala gives you the conciseness of a scripting language such as Ruby or Python, but without requiring you to give up the static type checking of more verbose languages like Java or C++. Scala's conciseness comes not only from its ability to infer both types and semicolons, but also its support for the functional programming style, which is discussed in the next step.
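The while and if logic from echoargs.scala can also be packaged as a method that returns the joined line instead of printing it, which makes it easy to try with different inputs. This is a sketch only; EchoDemo and echo are hypothetical names, not part of the original scripts.

```scala
// EchoDemo and echo are hypothetical names used only for illustration.
object EchoDemo {
  // Joins the arguments with single spaces, using the same
  // while/if structure as the echoargs.scala script.
  def echo(args: Array[String]): String = {
    val out = new StringBuilder
    var i = 0
    while (i < args.length) {
      if (i != 0)
        out.append(" ")   // a space before every argument except the first
      out.append(args(i))
      i += 1              // Scala has no ++ operator
    }
    out.toString
  }
}
```

For example, EchoDemo.echo(Array("Scala", "is", "fun")) builds the string "Scala is fun".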

Step 7. Iterate with foreach and for

Although you may not have realized it, when you wrote the while loops in the previous step, you were programming in an imperative style. In the imperative style, which is the style you would ordinarily use with languages like Java, C++, and C, you give one imperative command at a time, iterate with loops, and often mutate state shared between different functions or methods. Scala enables you to program imperatively, but as you get to know Scala better, you'll likely often find yourself programming in a more functional style. In fact, one of the main aims of the Scalazine will be to help you become as competent at functional programming as you are at imperative programming, using Scala as a vehicle.

One of the main characteristics of a functional language is that functions are first class constructs, and that's very true in Scala. For example, another (far more concise) way to print each command line argument is:

args.foreach(arg => println(arg))

In this code, you call the foreach method on args, and pass in a function. In this case, you're passing in an anonymous function (one with no name), which takes one parameter named arg. The code of the anonymous function is println(arg). If you type the above code into a new file named pa.scala, and execute with the command:

scala pa.scala Concise is nice

You should see:

Concise
is
nice

In the previous example, the Scala interpreter infers the type of arg to be String, since Strings are what the array on which you're calling foreach is holding. If you'd prefer to be more explicit, you can mention the type name, but when you do you'll need to wrap the argument portion in parentheses (which is the normal form of the syntax anyway). Try typing this into a file named epa.scala.

args.foreach((arg: String) => println(arg))

Running this script has the same behavior as the previous one. With the command:

scala epa.scala Explicit can be nice too

You'll get:

Explicit
can
be
nice
too

If instead of an explicit mood, you're in the mood for even more conciseness, you can take advantage of a special case in Scala. If an anonymous function consists of one method application that takes a single argument, you need not explicitly name and specify the argument. Thus, the following code also works:

args.foreach(println)

To summarize, the syntax for an anonymous function is a list of named parameters, in parentheses, a right arrow, and then the body of the function. This syntax is illustrated in Figure 1.

 



Figure 1. The syntax of a Scala anonymous function.
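As a textual stand-in for the syntax summary, here is a small sketch of a function literal with its three parts: parameters in parentheses, a right arrow, and a body. The names FunctionLiteralDemo, add, and words are assumptions made for this example.

```scala
// FunctionLiteralDemo and its members are hypothetical names used only for illustration.
object FunctionLiteralDemo {
  // Parameters in parentheses, a right arrow, then the body of the function.
  val add = (x: Int, y: Int) => x + y

  val words = List("Concise", "is", "nice")

  // The same anonymous function passed to a method in two equivalent ways.
  val explicitLengths = words.map((w: String) => w.length) // parameter type written out
  val inferredLengths = words.map(w => w.length)           // parameter type inferred
}
```

Both explicitLengths and inferredLengths hold List(7, 2, 4), and add(1, 2) evaluates to 3.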

 

Now, by this point you may be wondering what happened to those trusty for loops you have been accustomed to using in imperative languages such as Java. In an effort to guide you in a functional direction, only a functional relative of the imperative for (called a for comprehension) is available in Scala. While you won't see their full power and expressiveness in this article, we will give you a glimpse. In a new file named forprintargs.scala, type the following:

for (arg <- args)
  println(arg)

The parentheses after the for in this for comprehension contain arg <- args. To the left of the <- symbol, which you can say as “in”, is a declaration of a new val (not a var) named arg. To the right of <- is the familiar args array. When this code executes, arg will be assigned to each element of the args array and the body of the for, println(arg), will be executed. Scala's for comprehensions can do much more than this, but this simple form is similar in functionality to Java 5's:

// ...
for (String arg : args) {     // Remember, this is Java, not Scala
    System.out.println(arg);
}
// ...

or Ruby's

for arg in ARGV   # Remember, this is Ruby, not Scala
  puts arg
end

When you run the forprintargs.scala script with the command:

scala forprintargs.scala for is functional

You should see:

for
is
functional
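As a glimpse of what else a for comprehension can do, the sketch below adds a filter and a yield so that the loop produces a new array instead of printing. The filter and yield forms go slightly beyond what this step covers, and ForDemo and shout are hypothetical names.

```scala
// ForDemo and shout are hypothetical names used only for illustration.
object ForDemo {
  // "if" inside the parentheses skips elements; "yield" collects the
  // value of the body for each remaining element into a new Array.
  def shout(args: Array[String]): Array[String] =
    for (arg <- args if arg.nonEmpty) yield arg.toUpperCase
}
```

Here ForDemo.shout(Array("for", "", "is")) yields an array containing "FOR" and "IS"; the empty string is filtered out.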

Step 8. Parameterize Arrays with types

In addition to being functional, Scala is object-oriented. In Scala, as in Java, you define a blueprint for objects with classes. From a class blueprint, you can instantiate objects, or class instances, by using new. For example, the following Scala code instantiates a new String and prints it out:

val s = new String("Hello, world!")
println(s)

In the previous example, you parameterize the String instance with the initial value "Hello, world!". You can think of parameterization as meaning configuring an instance at the point in your program that you create that instance. You configure an instance with values by passing objects to a constructor of the instance in parentheses, just like you do when you create an instance in Java. If you place the previous code in a new file named paramwithvalues.scala and run it with scala paramwithvalues.scala, you'll see the familiar Hello, world! greeting printed out.

In addition to parameterizing instances with values at the point of instantiation, you can in Scala also parameterize them with types. This kind of parameterization is akin to specifying a type in angle brackets when instantiating a generic type in Java 5 and beyond. The main difference is that instead of the angle brackets used for this purpose in Java, in Scala you use square brackets. Here's an example:

val greetStrings = new Array[String](3)

greetStrings(0) = "Hello"
greetStrings(1) = ", "
greetStrings(2) = "world!\n"

for (i <- 0 to 2)
  print(greetStrings(i))

In this example, greetStrings is a value of type Array[String] (say this as, “an array of string”) that is initialized to length 3 by passing the value 3 to a constructor in parentheses in the first line of code. Type this code into a new file called paramwithtypes.scala and execute it with scala paramwithtypes.scala, and you'll see yet another Hello, world! greeting. Note that when you parameterize an instance with both a type and a value, the type comes first in its square brackets, followed by the value in parentheses.

Had you been in a more explicit mood, you could have specified the type of greetStrings explicitly like this:

val greetStrings: Array[String] = new Array[String](3)
// ...

Given Scala's type inference, this line of code is semantically equivalent to the actual first line of code in paramwithtypes.scala. But this form demonstrates that while the type parameterization portion (the type names in square brackets) forms part of the type of the instance, the value parameterization part (the values in parentheses) does not. The type of greetStrings is Array[String], not Array[String](3).

The next three lines of code in paramwithtypes.scala initialize each element of the greetStrings array:

// ...
greetStrings(0) = "Hello"
greetStrings(1) = ", "
greetStrings(2) = "world!\n"
// ...

As mentioned previously, arrays in Scala are accessed by placing the index inside parentheses, not square brackets as in Java. Thus the zeroth element of the array is greetStrings(0), not greetStrings[0] as in Java.

These three lines of code illustrate an important concept to understand about Scala concerning the meaning of val. When you define a variable with val, the variable can't be reassigned, but the object to which it refers could potentially still be mutated. So in this case, you couldn't reassign greetStrings to a different array; greetStrings will always point to the same Array[String] instance with which it was initialized. But you can change the elements of that Array[String] over time, so the array itself is mutable.
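The difference between reassigning a val and mutating the object it refers to can be sketched as follows. ValMutationDemo, letters, and snapshot are names assumed for this example.

```scala
// ValMutationDemo and its members are hypothetical names used only for illustration.
object ValMutationDemo {
  val letters = Array("a", "b", "c")

  // Allowed: the array that letters refers to is mutable.
  letters(0) = "z"

  // Not allowed: letters itself is a val, so this line would not compile.
  // letters = Array("x", "y")

  // Copy the current contents into an immutable List for inspection.
  val snapshot = letters.toList
}
```

After initialization, snapshot is List("z", "b", "c"): the element changed, but letters still refers to the same array.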

The final two lines in paramwithtypes.scala contain a for comprehension that prints out each greetStrings array element in turn.

// ...
for (i <- 0 to 2)
  print(greetStrings(i))

The first line of code in this for comprehension illustrates another general rule of Scala: if a method takes only one parameter, you can call it without a dot or parentheses. to is actually a method that takes one Int argument. The code 0 to 2 is transformed into the method call 0.to(2). (This to method actually returns not an Array but a Range, a sequence that contains the values 0, 1, and 2.) Scala doesn't technically have operator overloading, because it doesn't actually have operators in the traditional sense. Characters such as +, -, *, and /, have no special meaning in Scala, but they can be used in method names. Thus, the expression 1 + 2, which was the first Scala code you typed into the interpreter in Step 2, is essentially identical in meaning to 1.+(2), where + is the name of a method defined in class scala.Int.
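Because operators are ordinary methods, a class of your own can define them too. The sketch below assumes a made-up Money class to show this, next to the built-in Int examples from the text; the literals are wrapped in parentheses only to keep the method-call form unambiguous to read.

```scala
// OperatorDemo and Money are hypothetical names used only for illustration.
object OperatorDemo {
  // "+" is just a method name, so any class may define it.
  class Money(val cents: Int) {
    def +(other: Money): Money = new Money(cents + other.cents)
  }

  val viaOperator = 1 + 2          // operator notation
  val viaMethod = (1).+(2)         // the same call in method-call notation
  val range = (0).to(2).toList     // 0 to 2, written as a method call

  val total = new Money(150) + new Money(250)
}
```

Both viaOperator and viaMethod are 3, range is List(0, 1, 2), and total.cents is 400.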

Another important idea illustrated by this example will give you insight into why arrays are accessed with parentheses in Scala. Scala has fewer special cases than Java. Arrays are simply instances of classes like any other class in Scala. When you apply parentheses to a variable and pass in some arguments, Scala will transform that into an invocation of a method named apply. So greetStrings(i) gets transformed into greetStrings.apply(i). Thus accessing the element of an array in Scala is simply a method call like any other method call. What's more, the compiler will transform any application of parentheses with some arguments on any type into an apply method call, not just arrays. Of course it will compile only if that type actually defines an apply method. So it's not a special case; it's a general rule.

Similarly, when an assignment is made to a variable that is followed by some arguments in parentheses, the compiler will transform that into an invocation of an update method that takes two parameters. For example,

greetStrings(0) = "Hello" 

will essentially be transformed into

greetStrings.update(0, "Hello")

Thus, the following Scala code is semantically equivalent to the code you typed into paramwithtypes.scala:

val greetStrings = new Array[String](3)

greetStrings.update(0, "Hello")
greetStrings.update(1, ", ")
greetStrings.update(2, "world!\n")

for (i <- 0.to(2))
  print(greetStrings.apply(i))
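Since apply and update are general rules rather than array-specific features, any class that defines them gets the same parenthesis syntax. The following is a minimal sketch under that assumption; the Slots class and its writes counter are invented for this example.

```scala
// ApplyUpdateDemo and Slots are hypothetical names used only for illustration.
object ApplyUpdateDemo {
  // A fixed-size store that also counts how often it was written to.
  class Slots(size: Int) {
    private val data = new Array[String](size)
    var writes = 0

    // slots(i) is rewritten by the compiler to slots.apply(i)
    def apply(i: Int): String = data(i)

    // slots(i) = v is rewritten by the compiler to slots.update(i, v)
    def update(i: Int, value: String): Unit = {
      data(i) = value
      writes += 1
    }
  }

  val slots = new Slots(2)
  slots(0) = "Hello"      // calls update
  slots(1) = ", world!"   // calls update
  val joined = slots(0) + slots(1) // calls apply twice
}
```

After initialization, joined is "Hello, world!" and slots.writes is 2, which confirms that both assignments went through update.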

Scala achieves a conceptual simplicity by treating everything, from arrays to expressions, as objects with methods. You as the programmer don't have to remember lots of special cases, such as the differences in Java between primitive types and their corresponding wrapper types, or between arrays and regular objects. However, it is significant to note that in Scala this uniformity does not usually come with a performance cost as it often has in other languages that have aimed to be pure in their object orientation. The Scala compiler uses Java arrays, primitive types, and native arithmetic where possible in the compiled code. Thus Scala really does give you the best of both worlds in this sense: the conceptual simplicity of a pure object-oriented language with the runtime performance characteristics of a language that has special cases for performance reasons.

Step 9. Use Lists and Tuples

One of the big ideas of the functional style of programming is that methods should not have side effects. The only effect of a method should be to compute the value or values that are returned by the method. Some benefits gained when you take this approach are that methods become less entangled, and therefore more reliable and reusable. Another benefit of the functional style in a statically typed language is that everything that goes into and out of a method is checked by a type checker, so logic errors are more likely to manifest themselves as type errors. To apply this functional philosophy to the world of objects, you would make objects immutable. A simple example of an immutable object in Java is String. If you create a String with the value "Hello, ", it will keep that value for the rest of its lifetime. If you later call concat("world!") on that String, it will not add "world!" to itself. Instead, it will create and return a brand new String with the value "Hello, world!".
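The String example can be checked directly. In this small sketch, ImmutableStringDemo and its member names are assumptions introduced for illustration.

```scala
// ImmutableStringDemo and its members are hypothetical names used only for illustration.
object ImmutableStringDemo {
  val hello = "Hello, "

  // concat does not modify hello; it returns a brand new String.
  val greeting = hello.concat("world!")
}
```

After this runs, hello is still "Hello, " while greeting is "Hello, world!".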

As you've seen, a Scala Array is a mutable sequence of objects that all share the same type. An Array[String] contains only Strings, for example. Although you can't change the length of an Array after it is instantiated, you can change its element values. Thus, Arrays are mutable objects. An immutable, and therefore more functional, sequence of objects that share the same type is Scala's List. As with Arrays, a List[String] contains only Strings. Scala's List, scala.List, differs from Java's java.util.List type in that Scala Lists are always immutable (whereas Java Lists can be mutable). But more importantly, Scala's List is designed to enable a functional style of programming. Creating a List is easy. You just say:

val oneTwoThree = List(1, 2, 3)

This establishes a new val named oneTwoThree, which is initialized with a new List[Int] with the integer element values 1, 2 and 3. (You don't need to say new List because “List” is defined as a factory method on the scala.List singleton object. More on Scala's singleton object construct in Step 11.) Because Lists are immutable, they behave a bit like Java Strings in that when you call a method on one that might seem by its name to imply the List will be mutated, it instead creates a new List with the new value and returns it. For example, List has a method named ::: that concatenates a passed List and the List on which ::: was invoked. Here's how you use it:

val oneTwo = List(1, 2)
val threeFour = List(3, 4)
val oneTwoThreeFour = oneTwo ::: threeFour
// String interpolation avoids the deprecated List + String concatenation
println(s"$oneTwo and $threeFour were not mutated.")
println(s"Thus, $oneTwoThreeFour is a new List.")

Type this code into a new file called listcat.scala and execute it with scala listcat.scala, and you should see:

List(1, 2) and List(3, 4) were not mutated.
Thus, List(1, 2, 3, 4) is a new List.

Enough said. [2]

Perhaps the most common operator you'll use with Lists is ::, which is pronounced “cons.” Cons prepends a new element to the beginning of an existing List, and returns the resulting List. For example, if you type the following code into a file named consit.scala:

val twoThree = List(2, 3)
val oneTwoThree = 1 :: twoThree
println(oneTwoThree)

And execute it with scala consit.scala, you should see:

List(1, 2, 3)

Given that a shorthand way to specify an empty List is Nil, one way to initialize new Lists is to string together elements with the cons operator, with Nil as the last element. For example, if you type the following code into a file named consinit.scala:

val oneTwoThree = 1 :: 2 :: 3 :: Nil
println(oneTwoThree)

And execute it with scala consinit.scala, you should again see:

List(1, 2, 3)
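One detail worth seeing in code is that :: is a method of the List on its right-hand side. Method names that end in a colon are invoked on the right operand, so 1 :: twoThree means twoThree.::(1). The sketch below illustrates this; ConsDemo and its member names are assumptions made for the example.

```scala
// ConsDemo and its members are hypothetical names used only for illustration.
object ConsDemo {
  val twoThree = List(2, 3)

  // Operator notation and the explicit method call build the same List.
  val viaOperator = 1 :: twoThree
  val viaMethod = twoThree.::(1)

  // 1 :: 2 :: 3 :: Nil groups from the right, starting at Nil.
  val built = 1 :: (2 :: (3 :: Nil))
}
```

All three values are equal to List(1, 2, 3), and twoThree is left unchanged.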

Scala's List is packed with useful methods, many of which are shown in Table 1.

 

What it Is | What it Does
List() | Creates an empty List
Nil | The empty List
List("Cool", "tools", "rule") | Creates a new List[String] with the three values "Cool", "tools", and "rule"
val thrill = "Will" :: "fill" :: "until" :: Nil | Creates a new List[String] with the three values "Will", "fill", and "until"
thrill(2) | Returns the element at index 2 (zero based) of the thrill List (returns "until")
thrill.count(s => s.length == 4) | Counts the number of String elements in thrill that have length 4 (returns 2)
thrill.drop(2) | Returns the thrill List without its first 2 elements (returns List("until"))
thrill.dropRight(2) | Returns the thrill List without its rightmost 2 elements (returns List("Will"))
thrill.exists(s => s == "until") | Determines whether a String element exists in thrill that has the value "until" (returns true)
thrill.filter(s => s.length == 4) | Returns a List of all elements, in order, of the thrill List that have length 4 (returns List("Will", "fill"))
thrill.forall(s => s.endsWith("l")) | Indicates whether all elements in the thrill List end with the letter "l" (returns true)
thrill.foreach(s => print(s)) | Executes the print statement on each of the Strings in the thrill List (prints "Willfilluntil")
thrill.foreach(print) | Same as the previous, but more concise (also prints "Willfilluntil")
thrill.head | Returns the first element in the thrill List (returns "Will")
thrill.init | Returns a List of all but the last element in the thrill List (returns List("Will", "fill"))
thrill.isEmpty | Indicates whether the thrill List is empty (returns false)
thrill.last | Returns the last element in the thrill List (returns "until")
thrill.length | Returns the number of elements in the thrill List (returns 3)
thrill.map(s => s + "y") | Returns a List resulting from adding a "y" to each String element in the thrill List (returns List("Willy", "filly", "untily"))
thrill.filterNot(s => s.length == 4) | Returns a List of all elements, in order, of the thrill List except those that have length 4 (returns List("until"))
thrill.reverse | Returns a List containing all elements of the thrill List in reverse order (returns List("until", "fill", "Will"))
thrill.sortWith((s, t) => s.charAt(0).toLower < t.charAt(0).toLower) | Returns a List containing all elements of the thrill List in alphabetical order of the first character lowercased (returns List("fill", "until", "Will"))
thrill.tail | Returns the thrill List minus its first element (returns List("fill", "until"))

 

Table 1. Some List methods and usages.
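To try a few of the methods from Table 1 in one go, something like the following script should work. It is only a sketch: the val names other than thrill are invented here, and it uses filterNot and sortWith, the names these operations have in current Scala releases.

```scala
val thrill = "Will" :: "fill" :: "until" :: Nil

// Each call returns a new value; thrill itself is never modified.
val fourLetters = thrill.filter(s => s.length == 4)
val others = thrill.filterNot(s => s.length == 4)
val withY = thrill.map(s => s + "y")
val allEndInL = thrill.forall(s => s.endsWith("l"))
val sorted = thrill.sortWith((s, t) => s.charAt(0).toLower < t.charAt(0).toLower)

println(fourLetters)  // List(Will, fill)
println(others)       // List(until)
println(withY)        // List(Willy, filly, untily)
println(allEndInL)    // true
println(sorted)       // List(fill, until, Will)
println(thrill)       // List(Will, fill, until)
```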

Besides List, one other ordered collection of object elements that's very useful in Scala is the tuple. Like Lists, tuples are immutable, but unlike Lists, tuples can contain different types of elements. Thus whereas a list might be a List[Int] or a List[String], a tuple could contain both an Int and a String at the same time. Tuples are very useful, for example, if you need to return multiple objects from a method. Whereas in Java, you would often create a JavaBean-like class to hold the multiple return values, in Scala you can simply return a tuple. And it is simple: to instantiate a new tuple that holds some objects, just place the objects in parentheses, separated by commas. Once you have a tuple instantiated, you can access its elements individually with a dot, underscore, and the one-based index of the element. For example, type the following code into a file named luftballons.scala:

val pair = (99, "Luftballons")
println(pair._1)
println(pair._2)

In the first line of this code, you create a new tuple that contains an Int with the value 99 as its first element, and a String with the value "Luftballons" as its second element. Scala infers the type of the tuple to be Tuple2[Int, String], and gives that type to the variable pair as well. In the second line, you access the _1 field, which will produce the first element, 99. The . in the second line is the same dot you'd use to access a field or invoke a method. In this case you are accessing a field named _1. If you run this script with scala luftballons.scala, you'll see:

99
Luftballons

The actual type of a tuple depends upon the number of elements it contains and the types of those elements. Thus, the type of (99, "Luftballons") is Tuple2[Int, String]. The type of ('u', 'r', "the", 1, 4, "me") is Tuple6[Char, Char, String, Int, Int, String].
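As a rough illustration of returning multiple objects from a method, the sketch below uses a hypothetical helper named minAndMax (not part of the Scala library) that hands back two Ints in a single tuple. It also shows a tuple pattern on the left-hand side of a val, which names both elements at once instead of going through _1 and _2:

```scala
// Hypothetical helper: returns the smallest and the largest value together.
def minAndMax(numbers: List[Int]): (Int, Int) =
  (numbers.min, numbers.max)

val result = minAndMax(List(3, 99, 42))
println(result._1)   // 3
println(result._2)   // 99

// A tuple pattern binds each element to its own name.
val (smallest, largest) = minAndMax(List(3, 99, 42))
println(s"$smallest to $largest")   // 3 to 99
```

The return type (Int, Int) is simply shorthand for Tuple2[Int, Int].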

Step 10. Use Sets and Maps

Because Scala aims to help you take advantage of both functional and imperative styles, its collections libraries make a point to differentiate between mutable and immutable collection classes. For example, Arrays are always mutable, whereas Lists are always immutable. When it comes to Sets and Maps, Scala also provides mutable and immutable alternatives, but in a different way. For Sets and Maps, Scala models mutability in the class hierarchy.

For example, the Scala API contains a base trait for Sets, where a trait is similar to a Java interface. (You'll find out more about traits in Step 12.) Scala then provides two subtraits, one for mutable Sets, and another for immutable Sets. As you can see in Figure 2, these three traits all share the same simple name, Set. Their fully qualified names differ, however, because they each reside in a different package. Concrete Set classes in the Scala API, such as the HashSet classes shown in Figure 2, extend either the mutable or immutable Set trait. (Although in Java you implement interfaces, in Scala you “extend” traits.) Thus, if you want to use a HashSet, you can choose between mutable and immutable varieties depending upon your needs.

 



Figure 2. Class hierarchy for Scala Sets.

 

To try out Scala Sets, type the following code into a file named jetset.scala:

import scala.collection.mutable.HashSet

val jetSet = new HashSet[String]
jetSet += "Lear"
jetSet += ("Boeing", "Airbus")
println(jetSet.contains("Cessna"))

The first line of jetset.scala imports the mutable HashSet. As with Java, the import allows you to use the simple name of the class, HashSet, in this source file. After a blank line, the third line initializes jetSet with a new HashSet that will contain only Strings. Note that just as with Lists and Arrays, when you create a Set, you need to parameterize it with a type (in this case, String), since every object in a Set must share the same type. The subsequent two lines add three objects to the mutable Set via the += method. As with most other symbols you've seen that look like operators in Scala, += is actually a method defined on class HashSet. Had you wanted to, instead of writing jetSet += "Lear", you could have written jetSet.+=("Lear"). Because the += method takes a variable number of arguments, you can pass one or more objects at a time to it. For example, jetSet += "Lear" adds one String to the HashSet, but jetSet += ("Boeing", "Airbus") adds two Strings to the set. Finally, the last line prints out whether or not the Set contains a particular String. (As you'd expect, it prints false.)
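To make the mutable/immutable distinction concrete, here is a small sketch that puts the two kinds of Set side by side. The movieSet and biggerSet names are invented for this example; a plain Set(...) with no import gives you the immutable variety.

```scala
import scala.collection.mutable

// Mutable: += changes the set that jetSet refers to.
val jetSet = mutable.HashSet[String]()
jetSet += "Lear"
jetSet += "Boeing"
println(jetSet.contains("Lear"))   // true
println(jetSet.size)               // 2

// Immutable: + returns a new set and leaves the original untouched.
val movieSet = Set("Hitch", "Poltergeist")
val biggerSet = movieSet + "Shrek"
println(movieSet.size)             // 2
println(biggerSet.size)            // 3
```

Note that jetSet is a val in both styles; what differs is whether the object it refers to can change.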

Another useful collection class in Scala is the Map. As with Sets, Scala provides mutable and immutable versions of Map, using a class hierarchy. As you can see in Figure 3, the class hierarchy for Maps looks a lot like the one for Sets. There's a base Map trait in package scala.collection, and two subtrait Maps: a mutable Map in scala.collection.mutable and an immutable one in scala.collection.immutable.

 



Figure 3. Class hierarchy for Scala Maps.

 

Implementations of Map, such as the HashMaps shown in the class hierarchy in Figure 3, implement either the mutable or immutable trait. To see a Map in action, type the following code into a file named treasure.scala.

// In treasure.scala

import scala.collection.mutable.HashMap

val treasureMap = new HashMap[Int, String]
treasureMap += 1 -> "Go to island."
treasureMap += 2 -> "Find big X on ground."
treasureMap += 3 -> "Dig."
println(treasureMap(2))

On the first line of treasure.scala, you import the mutable form of HashMap. After a blank line, you define a val named treasureMap and initialize it with a new mutable HashMap whose keys will be Ints and values Strings. On the next three lines you add key/value pairs to the HashMap using the -> method. As illustrated in previous examples, the Scala compiler transforms a binary operation expression like 1 -> "Go to island." into 1.->("Go to island."). Thus, when you say 1 -> "Go to island.", you are actually calling a method named -> on an Int with the value 1, and passing in a String with the value "Go to island." This -> method, which you can invoke on any object in a Scala program, returns a two-element tuple containing the key and value. You then pass this tuple to the += method of the HashMap object to which treasureMap refers. Finally, the last line prints the value that corresponds to the key 2 in the treasureMap. If you run this code, it will print:

Find big X on ground.

Because maps are such a useful programming construct, Scala provides a factory method for Maps that is similar in spirit to the factory method shown in Step 9 that allows you to create Lists without using the new keyword. To try out this more concise way of constructing maps, type the following code into a file called numerals.scala:

// In numerals.scala
val romanNumeral = Map(1 -> "I", 2 -> "II", 3 -> "III", 4 -> "IV", 5 -> "V")
println(romanNumeral(4))

In numerals.scala you take advantage of the fact that the immutable Map trait is automatically imported into any Scala source file. Thus when you say Map in the first line of code, the Scala interpreter knows you mean scala.collection.immutable.Map. In this line, you call a factory method on the immutable Map's companion object, passing in five key/value tuples as parameters. This factory method returns an instance of the immutable HashMap containing the passed key/value pairs. The name of the factory method is actually apply, but as mentioned in Step 8, if you say Map(...) it will be transformed by the compiler to Map.apply(...). If you run the numerals.scala script, it will print IV.
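The following sketch pokes at the immutable Map a little further. It shows that key -> value really is just a tuple, and that "adding" to an immutable Map produces a new Map. The names fromTuples, missing, fallback, and extended are made up for illustration:

```scala
val romanNumeral = Map(1 -> "I", 2 -> "II", 3 -> "III")

// 1 -> "I" is just a Tuple2, so these two maps are equal.
val fromTuples = Map((1, "I"), (2, "II"), (3, "III"))
println(romanNumeral == fromTuples)   // true

// get returns an Option instead of throwing when the key is absent;
// getOrElse supplies a default value for that case.
val missing = romanNumeral.get(4)
val fallback = romanNumeral.getOrElse(4, "?")
println(missing)                      // None
println(fallback)                     // ?

// Adding a pair to an immutable Map yields a new Map.
val extended = romanNumeral + (4 -> "IV")
println(romanNumeral.size)            // 3
println(extended(4))                  // IV
```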

Step 11. Understand classes and singleton objects

Up to this point you've written Scala scripts to try out the concepts presented in this article. For all but the simplest projects, however, you will likely want to partition your application code into classes. To give this a try, type the following code into a file called greetSimply.scala:

// In greetSimply.scala

class SimpleGreeter {
  val greeting = "Hello, world!"
  def greet() = println(greeting)
}

val g = new SimpleGreeter
g.greet()

greetSimply.scala is actually a Scala script, but one that contains a class definition. This first example, however, illustrates that as in Java, classes in Scala encapsulate fields and methods. Fields are defined with either val or var. Methods are defined with def. For example, in class SimpleGreeter, greeting is a field and greet is a method. To use the class, you initialize a val named g with a new instance of SimpleGreeter. You then invoke the greet instance method on g. If you run this script with scala greetSimply.scala, you will be dazzled with yet another Hello, world!.

Although classes in Scala are in many ways similar to Java, in several ways they are quite different. One difference between Java and Scala involves constructors. In Java, classes have constructors, which can take parameters, whereas in Scala, classes can take parameters directly. The Scala notation is more concise: class parameters can be used directly in the body of the class, so there's no need to define fields and write assignments that copy constructor parameters into fields. This can yield substantial savings in boilerplate code, especially for small classes. To see this in action, type the following code into a file named greetFancily.scala:

// In greetFancily.scala

class FancyGreeter(greeting: String) {
  def greet() = println(greeting)
}

val g = new FancyGreeter("Salutations, world!")
g.greet()

Instead of defining a constructor that takes a String, as you would do in Java, in greetFancily.scala you placed the greeting parameter of that constructor in parentheses placed directly after the name of the class itself, before the open curly brace of the body of class FancyGreeter. When defined in this way, greeting essentially becomes a value (not a variable—it can't be reassigned) field that's available anywhere inside the body. In fact, you pass it to println in the body of the greet method. If you run this script with the command scala greetFancily.scala, it will inspire you with:

Salutations, world!

This is cool and concise, but what if you wanted to check the String passed to FancyGreeter's primary constructor for null, and throw NullPointerException to abort the construction of the new instance? Fortunately, you can. Any code sitting inside the curly braces surrounding the class definition, but which isn't part of a method definition, is compiled into the body of the primary constructor. In essence, the primary constructor will first initialize what is essentially a final field for each parameter in parentheses following the class name. It will then execute any top-level code contained in the class's body. For example, to check a passed parameter for null, type in the following code into a file named greetCarefully.scala:

// In greetCarefully.scala
class CarefulGreeter(greeting: String) {

  if (greeting == null) {
    throw new NullPointerException("greeting was null")
  }

  def greet() = println(greeting)
}

new CarefulGreeter(null)

In greetCarefully.scala, an if statement is sitting smack in the middle of the class body, something that wouldn't compile in Java. The Scala compiler places this if statement into the body of the primary constructor, just after code that initializes what is essentially a final field named greeting with the passed value. Thus, if you pass in null to the primary constructor, as you do in the last line of the greetCarefully.scala script, the primary constructor will first initialize the greeting field to null. Then, it will execute the if statement that checks whether the greeting field is equal to null, and since it is, it will throw a NullPointerException. If you run greetCarefully.scala, you will see a NullPointerException stack trace.
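If you would rather see the check fire without a stack trace ending the script, one option is to wrap the construction in a try expression. This is only a sketch of that idea; the outcome val and its messages are invented for this example.

```scala
class CarefulGreeter(greeting: String) {

  // Top-level code in the class body runs as part of the primary constructor.
  if (greeting == null) {
    throw new NullPointerException("greeting was null")
  }

  def greet() = println(greeting)
}

// Catch the exception so that the script keeps running.
val outcome =
  try {
    new CarefulGreeter(null)
    "constructed"
  } catch {
    case e: NullPointerException => "rejected: " + e.getMessage
  }

println(outcome)   // rejected: greeting was null
```

Because the exception is thrown inside the primary constructor, no CarefulGreeter instance is ever handed back to the caller.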

In Java, you sometimes give classes multiple constructors with overloaded parameter lists. You can do that in Scala as well; however, you must pick one of them to be the primary constructor, and place those constructor parameters directly after the class name. You then place any additional auxiliary constructors in the body of the class as methods named this. To try this out, type the following code into a file named greetRepeatedly.scala:

// In greetRepeatedly.scala
class RepeatGreeter(greeting: String, count: Int) {

  def this(greeting: String) = this(greeting, 1)

  def greet() = {
    for (i <- 1 to count)
      println(greeting)
  }
}

val g1 = new RepeatGreeter("Hello, world", 3)
g1.greet()
val g2 = new RepeatGreeter("Hi there!")
g2.greet()

RepeatGreeter's primary constructor takes not only a String greeting parameter, but also an Int count of the number of times to print the greeting. However, RepeatGreeter also contains a definition of an auxiliary constructor, the this method that takes a single String greeting parameter. The body of this constructor consists of a single statement: an invocation of the primary constructor parameterized with the passed greeting and a count of 1. In the final four lines of the greetRepeatedly.scala script, you create two RepeatGreeter instances, one using each constructor, and call greet on each. If you run greetRepeatedly.scala, it will print:

Hello, world
Hello, world
Hello, world
Hi there!
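Auxiliary constructors can also chain to one another, as long as every chain eventually reaches the primary constructor. The sketch below is a variation on RepeatGreeter, not the original class: the no-argument constructor and the greetings method (which returns the lines instead of printing them) are additions made purely for illustration.

```scala
class RepeatGreeter(greeting: String, count: Int) {

  // Auxiliary constructor: delegates to the primary constructor.
  def this(greeting: String) = this(greeting, 1)

  // Hypothetical second auxiliary constructor: delegates to the one above.
  def this() = this("Hello, world")

  // Returns the greeting repeated count times, rather than printing it.
  def greetings: List[String] = List.fill(count)(greeting)
}

println(new RepeatGreeter("Hi there!", 2).greetings)  // List(Hi there!, Hi there!)
println(new RepeatGreeter("Hi there!").greetings)     // List(Hi there!)
println(new RepeatGreeter().greetings)                // List(Hello, world)
```

Each auxiliary constructor must start by calling another constructor of the same class, which is why all three ways of creating a RepeatGreeter end up running the primary constructor.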

Another area in which Scala departs from Java is that you can't have any static fields or methods in a Scala class. Instead, Scala allows you to create singleton objects using the keyword object. A singleton object cannot, and need not, be instantiated with new. It is essentially automatically instantiated the first time it is used, and as the “singleton” in its name implies, there is only ever one instance. A singleton object can share the same name with a class, and when it does, the singleton is called the class's companion object. The Scala compiler transforms the fields and methods of a singleton object to static fields and methods of the resulting binary Java class. To give this a try, type the following code into a file named WorldlyGreeter.scala:

// In WorldlyGreeter.scala

// The WorldlyGreeter class
class WorldlyGreeter(greeting: String) {
  def greet() = {
    val worldlyGreeting = WorldlyGreeter.worldify(greeting)
    println(worldlyGreeting)
  }
}

// The WorldlyGreeter companion object
object WorldlyGreeter {
  def worldify(s: String) = s + ", world!"
}

In this file, you define both a class, with the class keyword, and a companion object, with the object keyword. Both types are named WorldlyGreeter. One way to think about this if you are coming from a Java programming perspective is that any static methods that you would have placed in class WorldlyGreeter in Java, you'd put in singleton object WorldlyGreeter in Scala. In fact, when the Scala compiler generates bytecodes for this file, it will create a Java class named WorldlyGreeter that has an instance method named greet (defined in the WorldlyGreeter class in the Scala source) and a static method named worldify (defined in the WorldlyGreeter companion object in Scala source). Note also that in the first line of the greet method in class WorldlyGreeter, you invoke the singleton object's worldify method using a syntax similar to the way you invoke static methods in Java: the singleton object name, a dot, and the method name:

// Invoking a method on a singleton object from class WorldlyGreeter 
// ...
val worldlyGreeting = WorldlyGreeter.worldify(greeting)
// ...

To run this code, you'll need to create an application. Type the following code into a file named WorldlyApp.scala:

// In WorldlyApp.scala
// A singleton object with a main method that allows
// this singleton object to be run as an application
object WorldlyApp {
  def main(args: Array[String]): Unit = {
    val wg = new WorldlyGreeter("Hello")
    wg.greet()
  }
}

Because there's no class named WorldlyApp, this singleton object is not a companion object. It is instead called a stand-alone object. Thus, a singleton object is either a companion or a stand-alone object. The distinction is important because companion objects get a few special privileges, such as access to private members of the like-named class.
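The "special privileges" can be sketched with a tiny example. Everything here (Account, Auditor, balance, describe, report) is hypothetical and exists only to show the access rule: the companion object can read a private member of its class, while a stand-alone object cannot. The class and its companion must be defined together in the same file.

```scala
class Account(private val balance: Int)

// Companion object: same name as the class, defined in the same file.
object Account {
  // Allowed: a companion may read the private field of class Account.
  def describe(a: Account): String = "balance is " + a.balance
}

// Stand-alone object: there is no class named Auditor.
object Auditor {
  // def peek(a: Account) = a.balance   // would not compile: balance is private
  def report(a: Account): String = Account.describe(a)
}

println(Auditor.report(new Account(42)))   // balance is 42
```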

One difference between Scala and Java is that whereas Java requires you to put a public class in a file named after the class—for example, you'd put class SpeedRacer in file SpeedRacer.java—in Scala, you can name .scala files anything you want, no matter what Scala classes or code you put in them. In general in the case of non-scripts, however, it is recommended style to name files after the classes they contain as is done in Java, so that programmers can more easily locate classes by looking at file names. This is the approach we've taken with the two files in this example, WorldlyGreeter.scala and WorldlyApp.scala.

Neither WorldlyGreeter.scala nor WorldlyApp.scala is a script, because each ends in a definition. A script, by contrast, must end in a result expression. Thus if you try to run either of these files as a script, for example by typing:

scala WorldlyGreeter.scala # This won't work!

the Scala interpreter will complain that WorldlyGreeter.scala does not end in a result expression. Instead, you'll need to actually compile these files with the Scala compiler, then run the resulting class files. One way to do this is to use scalac, which is the basic Scala compiler. Simply type:

scalac WorldlyApp.scala WorldlyGreeter.scala

Given that the scalac compiler starts up a new JVM instance each time it is invoked, and that the JVM often has a perceptible start-up delay, the Scala distribution also includes a Scala compiler daemon called fsc (for fast Scala compiler). You use it like this:

fsc WorldlyApp.scala WorldlyGreeter.scala

The first time you run fsc, it will create a local server daemon attached to a port on your computer. It will then send the list of files to compile to the daemon via the port, and the daemon will compile the files. The next time you run fsc, the daemon will already be running, so fsc will simply send the file list to the daemon, which will immediately compile the files. Using fsc, you only need to wait for the JVM to start up the first time. If you ever want to stop the fsc daemon, you can do so with fsc -shutdown.

Running either of these scalac or fsc commands will produce Java class files that you can then run via the scala command, the same command you used to invoke the interpreter in previous examples. However, instead of giving it a filename with a .scala extension containing Scala code to interpret, as you did in every previous example, in this case you'll give it the name of a singleton object containing a main method. Similar to Java, any Scala singleton object with a main method that takes a single parameter of type Array[String] and returns Unit can serve as the entry point to an application. In this example, WorldlyApp has a main method with the proper signature, so you can run this example by typing:

scala WorldlyApp

At which point you should see:

Hello, world!

You may recall seeing this output previously, but this time it was generated in this interesting manner:

  • The scala program fires up a JVM with WorldlyApp's main method as the entry point.
  • WorldlyApp's main method creates a new WorldlyGreeter instance via new, passing in the string "Hello" as a parameter.
  • Class WorldlyGreeter's primary constructor essentially initializes a final field named greeting with the passed value, "Hello" (this initialization code is automatically generated by the Scala compiler).
  • WorldlyApp's main method initializes a local val named wg with the new WorldlyGreeter instance.
  • WorldlyApp's main method then invokes greet on the WorldlyGreeter instance to which wg refers.
  • Class WorldlyGreeter's greet method invokes worldify on singleton object WorldlyGreeter, passing along the value of the final field greeting, "Hello".
  • Companion object WorldlyGreeter's worldify method returns a String consisting of the concatenation of the s parameter, which is "Hello", and the literal String ", world!".
  • Class WorldlyGreeter's greet method then initializes a val named worldlyGreeting with the "Hello, world!" String returned from the worldify method.
  • Class WorldlyGreeter's greet method passes the "Hello, world!" String to which worldlyGreeting refers to println, which sends the cheerful greeting, via the standard output stream, to you.

Step 12. Understand traits and mixins

As first mentioned in Step 10, Scala includes a construct called a trait, which is similar in spirit to Java's interface. One main difference between Java interfaces and Scala's traits is that whereas all methods in Java interfaces were traditionally abstract (Java 8 later added default methods), you can give methods real bodies with real code in Scala traits. Here's an example:

trait Friendly {
  def greet() = "Hi"
}

In this example, the greet method returns the String "Hi". If you are coming from Java, this greet method may look a little funny to you, as if greet() is somehow a field being initialized to the String value "Hi". What is actually going on is that lacking an explicit return statement, Scala methods will return the value of the last expression. In this case, the value of the last expression is "Hi", so that is returned. A more verbose way to say the same thing would be:

trait Friendly {
  def greet(): String = {
    return "Hi"
  }
}

Regardless of how you write the methods, however, the key point is that Scala traits can actually contain non-abstract methods. Another difference between Java interfaces and Scala traits is that whereas you implement Java interfaces, you extend Scala traits. Other than this implements/extends difference, however, inheritance when you are defining a new type works in Scala similarly to Java. In both Java and Scala, a class can extend one (and only one) other class. In Java, an interface can extend zero to many interfaces. Similarly in Scala, a trait can extend zero to many traits. In Java, a class can implement zero to many interfaces. Similarly in Scala, a class can extend zero to many traits. implements is not a keyword in Scala.
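As a quick sketch of "one class, many traits": when a class extends a superclass and several traits, the superclass follows extends and each trait is added with the with keyword. Animal, Loud, Parrot, and shout are hypothetical names introduced only for this illustration.

```scala
trait Friendly {
  def greet() = "Hi"
}

// Hypothetical second trait with a concrete method of its own.
trait Loud {
  def shout(s: String) = s.toUpperCase + "!"
}

// Hypothetical superclass.
class Animal

// One superclass, two traits.
class Parrot extends Animal with Friendly with Loud

val parrot = new Parrot
println(parrot.greet())                 // Hi
println(parrot.shout(parrot.greet()))   // HI!
```

A Parrot instance can be assigned to a variable of type Animal, Friendly, or Loud.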

Here's an example:

class Dog extends Friendly {
  override def greet() = "Woof"
}

In this example, class Dog extends trait Friendly. This inheritance relationship implies much the same thing as interface implementation does in Java. You can assign a Dog instance to a variable of type Friendly. For example:

var pet: Friendly = new Dog
println(pet.greet())

When you invoke the greet method on the Friendly pet variable, it will use dynamic binding, as in Java, to determine which implementation of the method to call. In this case, class Dog overrides the greet method, so Dog's implementation of greet will be invoked. Were you to execute the above code, you would get Woof (Dog's implementation of greet), not Hi (Friendly's implementation of greet). Note that one difference with Java is that to override a method in Scala, you must precede the method's def with override. If you attempt to override a method without specifying override, your Scala code won't compile.
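Dynamic binding is perhaps easiest to see when several different implementations sit behind the same static type. In the sketch below, Goldfish is a hypothetical class that extends Friendly without overriding anything, so it inherits the trait's greet:

```scala
trait Friendly {
  def greet() = "Hi"
}

class Dog extends Friendly {
  override def greet() = "Woof"
}

// Hypothetical class that inherits greet unchanged from the trait.
class Goldfish extends Friendly

// Every element has static type Friendly; the implementation that runs
// is chosen by the runtime class of each object.
val pets: List[Friendly] = List(new Dog, new Goldfish)
val greetings = pets.map(p => p.greet())
println(greetings)   // List(Woof, Hi)
```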

Finally, one quite significant difference between Java's interfaces and Scala's traits is that in Scala, you can mix in traits at instantiation time. For example, consider the following trait:

trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}

Trait ExclamatoryGreeter extends trait Friendly and overrides the greet method. ExclamatoryGreeter's greet method first invokes the superclass's greet method, appends an exclamation point to whatever the superclass’s greet method returns, and returns the resulting String. With this trait, you can mix in its behavior at instantiation time using the with keyword. Here's an example:

val pup: Friendly = new Dog with ExclamatoryGreeter
println(pup.greet())

Given the initial line of code, the Scala compiler will create a synthetic type that extends class Dog and trait ExclamatoryGreeter and instantiate it. When you invoke a method on the synthetic type, it will cause the correct implementation to be invoked. When you run this code, the pup variable will first be initialized with the new instance of the synthetic type, then when greet is invoked on pup, you'll see "Woof!". Note that had pup not been explicitly defined to be of type Friendly, the Scala compiler would have inferred the type of pup to be Dog with ExclamatoryGreeter.
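One detail worth seeing in action is that super.greet() inside ExclamatoryGreeter is not fixed when the trait is written; it depends on what the trait ends up being mixed into. The sketch below instantiates the trait once on its own and once mixed into Dog (the val names plain and pup are just for illustration):

```scala
trait Friendly {
  def greet() = "Hi"
}

class Dog extends Friendly {
  override def greet() = "Woof"
}

trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}

// On its own, the trait's super.greet() resolves to Friendly's greet.
val plain = new ExclamatoryGreeter {}
println(plain.greet())   // Hi!

// Mixed into Dog, the same super.greet() resolves to Dog's greet.
val pup = new Dog with ExclamatoryGreeter
println(pup.greet())     // Woof!
```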

To give all these concepts a try, type the following code into a file named friendly.scala:

trait Friendly {
  def greet() = "Hi"
}

class Dog extends Friendly {
  override def greet() = "Woof"
}

class HungryCat extends Friendly {
  override def greet() = "Meow"
}

class HungryDog extends Dog {
  override def greet() = "I'd like to eat my own dog food"
}

trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}

var pet: Friendly = new Dog
println(pet.greet())

pet = new HungryCat
println(pet.greet())

pet = new HungryDog
println(pet.greet())

pet = new Dog with ExclamatoryGreeter
println(pet.greet())

pet = new HungryCat with ExclamatoryGreeter
println(pet.greet())

pet = new HungryDog with ExclamatoryGreeter
println(pet.greet())

When you run the friendly.scala script, it will print:

Woof
Meow
I'd like to eat my own dog food
Woof!
Meow!
I'd like to eat my own dog food!

Conclusion

As you may have glimpsed by reading this article, the promise of Scala is that you can get more productivity while leveraging existing investments. Scala's basic conciseness of syntax and support for the functional programming style promise increased programmer productivity compared to the Java language, while enabling you to continue to take advantage of all the great things about the Java platform. You can complement Java code with Scala code and continue to leverage your existing Java code and APIs, the many APIs available for the Java platform, the runtime performance offered by JVMs, and your own knowledge of the Java platform.

With the knowledge you've gained in this article, you should already be able to get started using Scala for small tasks, especially scripts. In future articles, we will dive into more detail in these topics, and introduce other topics that weren't even hinted at here.