Hadoop and Spark!! Now it's time to get to know Scala!
We've experienced a little bit of Hadoop up through the last post. I am going to introduce Spark in a later post.
Before looking around Spark, we need to have some idea about the Scala language.
Here is a simple explanation of the Scala language. Today we will explore a little bit.
Step 1. Installation and simple guide.
From the official site, we can download Scala and install it as follows:
The official site of Scala is: https://www.scala-lang.org/
We can download the binary distribution and extract it, then move it to /usr/local/share/scala.
Path and Environment
For quick access, add scala and scalac to your path. For example:
Environment | Variable | Value (example) |
---|---|---|
Unix | $SCALA_HOME | /usr/local/share/scala |
Unix | $PATH | $PATH:$SCALA_HOME/bin |
Set the environment variables, then run some simple code.
Run it interactively! (An interface that shows results immediately as you type code.)
The scala command starts an interactive shell where Scala expressions are interpreted interactively.
> scala
This is a Scala shell.
Type in expressions to have them evaluated.
Type :help for more information.
scala> object HelloWorld {
     |   def main(args: Array[String]): Unit = {
     |     println("Hello, world!")
     |   }
     | }
defined module HelloWorld
scala> HelloWorld.main(null)
Hello, world!
scala>:q
>
The shortcut :q stands for the internal shell command :quit, which is used to exit the interpreter.
Compile it!
The scalac command compiles one (or more) Scala source file(s) and generates Java bytecode which can be executed on any standard JVM. The Scala compiler works similarly to javac, the Java compiler of the Java SDK.
> scalac HelloWorld.scala
By default scalac generates the class files into the current working directory. You may specify a different output directory using the -d option.
> scalac -d classes HelloWorld.scala
Execute it!
The scala command executes the generated bytecode with the appropriate options:
> scala HelloWorld
scala allows us to specify command options, such as the -classpath (alias -cp) option:
> scala -cp classes HelloWorld
The argument of the scala command has to be a top-level object. If that object extends the trait App, then all statements contained in that object will be executed; otherwise you have to add a method main which will act as the entry point of your program.
Here is what the “Hello, world” example looks like using the App trait:
object HelloWorld extends App {
  println("Hello, world!")
}
Here is more detail about Scala.
Step 2. Learn to use the Scala interpreter
The easiest way to get started with Scala is by using the Scala interpreter, which is an interactive “shell” for writing Scala expressions and programs. Simply type an expression into the interpreter and it will evaluate the expression and print the resulting value. The interactive shell for Scala is simply called scala. You use it like this:
$ scala
This is an interpreter for Scala.
Type in expressions to have them evaluated.
Type :help for more information.

scala>
After you type an expression, such as 1 + 2, and hit return:
scala> 1 + 2
The interpreter will print:
unnamed0: Int = 3
This line includes:
- an automatically assigned or user-defined name to refer to the computed value (unnamed0)
- a colon (:)
- the type of the expression and its resulting value (Int)
- an equals sign (=)
- the value resulting from evaluating the expression (3)
The type Int names the class Int in the package scala. Values of this class are implemented just like Java's int values. In early versions of Scala, int was accepted as an alias for scala.Int, and more generally all of Java's primitive type names were defined as aliases for classes in the scala package: typing boolean gave you scala.Boolean, and typing float gave you scala.Float. Current versions accept only the capitalized names such as Int, Boolean, and Float. When you compile your Scala code to Java bytecode, however, Scala will compile these types to Java's primitive types where possible to get the performance benefits of Java's primitive types.
The unnamedX identifier (newer versions of the interpreter name these results res0, res1, and so on) may be used in later lines. For instance, since unnamed0 was set to 3 previously, unnamed0 * 3 will be 9:
scala> unnamed0 * 3
unnamed1: Int = 9
To print the necessary, but not sufficient, Hello, world! greeting, type:
scala> println("Hello, world!")
Hello, world!
unnamed2: Unit = ()
The type of the result here is scala.Unit, which is Scala's analogue to void in Java. The main difference between Scala's Unit and Java's void is that Scala lets you write down a value of type Unit, namely (), whereas in Java there is no value of type void. (In other words, just as 1, 2, and 3 are potential values of type int in both Scala and Java, () is the one and only value of type Unit in Scala. By contrast, there are no values of type void in Java.) Except for this, Unit and void are equivalent. In particular, every void-returning method in Java is mapped to a Unit-returning method in Scala.
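As a minimal sketch of the point about Unit, written in the same script style as the rest of this post: the method name logLine is made up for illustration and is not part of the original text.

```scala
// A method that exists only for its side effect has result type Unit.
// logLine is a hypothetical name used for illustration.
def logLine(text: String): Unit = println(text)

// Unlike Java's void, Unit has exactly one value, written (),
// so the result of a Unit method can be stored like any other value.
val result: Unit = logLine("Hello, world!")

println(result) // prints "()"
```

Storing a Unit value is rarely useful in practice; the sketch only shows that () really is a value.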
Step 3. Define some variables
Scala differentiates between vals, variables that are assigned once and never change, and vars, variables that may change over their lifetime. Here's a val definition:
scala> val msg = "Hello, world!"
msg: java.lang.String = Hello, world!
This introduces msg as a name for the value "Hello, world!". The type of the value above is java.lang.String, because Scala strings are also Java strings. (In fact every Java class is available in Scala.)
This example also points out an important and very useful feature of Scala: type inference. Notice that you never said the word java.lang.String or even String in the val definition. The Scala interpreter inferred the val's type to be the type of its initialization assignment. Since msg was initialized to "Hello, world!", and since "Hello, world!" is of type java.lang.String, the compiler gave msg the type java.lang.String.
When the Scala interpreter (or compiler) can infer a type, it is usually best to let it do so rather than fill the code with unnecessary, explicit type annotations. You can, however, specify a type explicitly if you wish. (For example, you may wish to explicitly specify the types of public members of classes for documentation purposes.) In contrast to Java, where you specify a variable's type before its name, in Scala you specify a variable's type after its name, separated by a colon. For example:
scala> val msg2: java.lang.String = "Hello again, world!"
msg2: java.lang.String = Hello again, world!
Or, since java.lang types are visible with their simple names in Scala programs, simply:
scala> val msg3: String = "Hello yet again, world!"
msg3: String = Hello yet again, world!
Going back to our original msg, now that it is defined, you can use the msg value as you'd expect, as in:
scala> println(msg)
Hello, world!
unnamed3: Unit = ()
What you can't do with msg, given that it is a val not a var, is reassign it. For example, see how the interpreter complains when you attempt the following:
scala> msg = "Goodbye cruel world!"
<console>:5 error: assignment to non-variable
  val unnamed4 = {msg = "Goodbye cruel world!";msg}
If reassignment is what you want, you'll need to use a var, as in:
scala> var greeting = "Hello, world!"
greeting: java.lang.String = Hello, world!
Since greeting is a variable (defined with var) not a value (defined with val), you can reassign it later. If you are feeling grouchy later, for example, you could change your greeting to:
scala> greeting = "Leave me alone, world!"
greeting: java.lang.String = Leave me alone, world!
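The val and var rules from this step can be collected into one small script. This is only a sketch that reuses the names from the interpreter session; the commented-out line is the reassignment the compiler rejects.

```scala
// val: assigned once. The type String is inferred from the initializer.
val msg = "Hello, world!"
// msg = "Goodbye cruel world!"   // would not compile: msg is a val

// An explicit type annotation goes after the name, separated by a colon.
val msg3: String = "Hello yet again, world!"

// var: may be reassigned, but only to another String,
// because that is the type inferred at the definition.
var greeting = "Hello, world!"
greeting = "Leave me alone, world!"

println(msg)      // prints "Hello, world!"
println(msg3)     // prints "Hello yet again, world!"
println(greeting) // prints "Leave me alone, world!"
```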
Step 4. Define some methods
Now that you've worked with Scala variables, you'll probably want to write some methods. Here's how you do that in Scala:
scala> def max(x: Int, y: Int): Int = if (x < y) y else x
max: (Int,Int)Int
Method definitions start with def instead of val or var. The method's name, in this case max, is followed by a list of parameters in parentheses. A type annotation must follow every method parameter, preceded by a colon in the Scala way, because the Scala compiler (and interpreter, but from now on we'll just say compiler) does not infer method parameter types. In this example, the method named max takes two parameters, x and y, both of type Int. After the close parenthesis of max's parameter list you'll find another “: Int” type specifier. This one defines the result type of the max method itself.
Sometimes the Scala compiler will require you to specify the result type of a method. If the method is recursive, for example, you must explicitly specify the method result type. In the case of max, however, you may leave the result type specifier off and the compiler will infer it. Thus, the max method could have been written:
scala> def max2(x: Int, y: Int) = if (x < y) y else x
max2: (Int,Int)Int
Note that you must always explicitly specify a method's parameter types regardless of whether you explicitly specify its result type.
The name, parameter list, and result type, if specified, form a method's signature. After the method's signature you must put an equals sign and then the body of the method. Since max's body consists of just one statement, you need not place it inside curly braces, but you can if you want. So you could also have written:
scala> def max3(x: Int, y: Int) = { if (x < y) y else x }
max3: (Int,Int)Int
If you want to put more than one statement in the body of a method, you must enclose them inside curly braces.
Once you have defined a method, you can call it by name, as in:
scala> max(3, 5)
unnamed6: Int = 5
Note that if a method takes no parameters, as in:
scala> def greet() = println("Hello, world!")
greet: ()Unit
You can call it with or without parentheses (Scala 3 requires the parentheses for a method defined with an empty parameter list like this one):
scala> greet()
Hello, world!
unnamed7: Unit = ()

scala> greet
Hello, world!
unnamed8: Unit = ()
The recommended style guideline for such method invocations is that if the method may have side effects, you should provide the parentheses even if the compiler doesn't require them. Thus in this case, since the greet method prints to the standard output, it has side effects and you should invoke it with parentheses to alert programmers looking at the code.
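The method rules from this step, gathered into one script-style sketch. The factorial method is a hypothetical addition, included only to show a recursive method, which is the case where the result type has to be written out.

```scala
// Parameter types are always required; here the result type is explicit too.
def max(x: Int, y: Int): Int = if (x < y) y else x

// Result type left off: the compiler infers Int from the body.
def max2(x: Int, y: Int) = if (x < y) y else x

// A recursive method must declare its result type.
// factorial is a hypothetical example, not part of the original tutorial.
def factorial(n: Int): Int = if (n <= 1) 1 else n * factorial(n - 1)

// A method with side effects: define it and call it with parentheses.
def greet(): Unit = println("Hello, world!")

println(max(3, 5))    // prints 5
println(max2(7, 2))   // prints 7
println(factorial(5)) // prints 120
greet()               // prints "Hello, world!"
```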
Step 5. Write some Scala scripts
Although Scala is designed to help developers build large systems, it also scales down nicely such that it feels natural to write scripts in it. A script is just a sequence of statements in a file that will be executed sequentially. (By the way, if you're still running the scala interpreter, you can exit it by entering the :quit command.) Put this into a file named hello.scala:
println("Hello, world, from a script!")
then run:
>scala hello.scala
And you should get yet another greeting:
Hello, world, from a script!
Command line arguments to a Scala script are available via a Scala array named args. In Scala, arrays are zero based, as in Java, but you access an element by specifying an index in parentheses rather than square brackets. So the first element in a Scala array named steps is steps(0), not steps[0]. To try this out, type the following into a new file named helloarg.scala:
// Say hello to the first argument
println("Hello, " + args(0) + "!")
then run:
>scala helloarg.scala planet
In this command, "planet" is passed as a command line argument, which is accessed in the script as args(0). Thus, you should see:
Hello, planet!
Note also that this script included a comment. As with Java, the Scala compiler will ignore characters between // and the next end of line, as well as any characters between /* and */. This example also shows strings being concatenated with the + operator. This works as you'd expect. The expression "Hello, " + "world!" will result in the string "Hello, world!".
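Here is a small sketch of array indexing, comments, and string concatenation that runs without any command line arguments. The array name steps comes from the text above; its contents are made-up values for illustration.

```scala
/* The contents of steps are hypothetical;
   only the name comes from the text above. */
val steps = Array("download", "extract", "run")

// Elements are read with parentheses, not square brackets.
println(steps(0)) // prints "download"

// Strings are concatenated with the + operator.
val message = "Hello, " + steps(2) + "!"
println(message) // prints "Hello, run!"
```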
By the way, if you're on some flavor of Unix, you can run a Scala script as a shell script by prepending a “pound bang” directive at the top of the file. For example, type the following into a file named helloarg:
#!/bin/sh
exec scala "$0" "$@"
!#
// Say hello to the first argument
println("Hello, " + args(0) + "!")
The initial #!/bin/sh must be the very first line in the file. Once you set its execute permission:
>chmod +x helloarg
You can run the Scala script as a shell script by simply saying:
>./helloarg globe
Which should yield:
Hello, globe!
Step 6. Loop with while, decide with if
You write while loops in Scala in much the same way you do in Java. Try out a while by typing the following into a file named printargs.scala:
var i = 0
while (i < args.length) {
  println(args(i))
  i += 1
}
This script starts with a variable definition, var i = 0. Type inference gives i the type scala.Int, because that is the type of its initial value, 0. The while construct on the next line causes the block (the code between the curly braces) to be repeatedly executed until the boolean expression i < args.length is false. args.length gives the length of the args array, similar to the way you get the length of an array in Java. The block contains two statements, each indented two spaces, the recommended indentation style for Scala. The first statement, println(args(i)), prints out the ith command line argument. The second statement, i += 1, increments i by one. Note that Java's ++i and i++ don't work in Scala. To increment in Scala, you need to say either i = i + 1 or i += 1. Run this script with the following command:
>scala printargs.scala Scala is fun
And you should see:
Scala
is
fun
For even more fun, type the following code into a new file named echoargs.scala:
var i = 0
while (i < args.length) {
  if (i != 0)
    print(" ")
  print(args(i))
  i += 1
}
println()
In this version, you've replaced the println call with a print call, so that all the arguments will be printed out on the same line. To make this readable, you've inserted a single space before each argument except the first via the if (i != 0) construct. Since i != 0 will be false the first time through the while loop, no space will get printed before the initial argument. Lastly, you've added one more println to the end, to get a line return after printing out all the arguments. Your output will be very pretty indeed.
If you run this script with the following command:
>scala echoargs.scala Scala is even more fun
You'll get:
Scala is even more fun
Note that in Scala, as in Java, you must put the boolean expression for a while or an if in parentheses. (In other words, you can't say in Scala things like if i < 10 as you can in a language such as Ruby. You must say if (i < 10) in Scala.) Another similarity to Java is that if a block has only one statement, you can optionally leave off the curly braces, as demonstrated by the if statement in echoargs.scala. And although you haven't seen many of them, Scala does use semi-colons to separate statements as in Java, except that in Scala the semi-colons are very often optional, giving some welcome relief to your right pinky finger. If you had been in a more verbose mood, therefore, you could have written the echoargs.scala script as follows:
var i = 0;
while (i < args.length) {
  if (i != 0) {
    print(" ");
  }
  print(args(i));
  i += 1;
}
println();
If you type the previous code into a new file named echoargsverbosely.scala, and run it with the command:
> scala echoargsverbosely.scala In Scala semicolons are often optional
You should see the output:
In Scala semicolons are often optional
Note that because you had no parameters to pass to the println method, you could have left off the parentheses and the compiler would have been perfectly happy. But given the style guideline that you should always use parentheses when calling methods that may have side effects, coupled with the fact that by printing to the standard output println will indeed have side effects, you specified the parentheses even in the concise echoargs.scala version.
One of the benefits of Scala that you can begin to see with these examples, is that Scala gives you the conciseness of a scripting language such as Ruby or Python, but without requiring you to give up the static type checking of more verbose languages like Java or C++. Scala's conciseness comes not only from its ability to infer both types and semicolons, but also its support for the functional programming style, which is discussed in the next step.
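The echo logic from this step can also be tried without passing any command line arguments. In this sketch the array words is a hypothetical stand-in for args, and the output is collected in a StringBuilder so the result can be inspected as one string.

```scala
// Hypothetical stand-in for the script's args array.
val words = Array("Scala", "is", "even", "more", "fun")

val line = new StringBuilder
var i = 0
while (i < words.length) {
  if (i != 0)
    line.append(" ")     // a space before every word except the first
  line.append(words(i))
  i += 1                 // Scala has no ++ operator
}
println(line.toString)   // prints "Scala is even more fun"
```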
Step 7. Iterate with foreach and for
Although you may not have realized it, when you wrote the while loops in the previous step, you were programming in an imperative style. In the imperative style, which is the style you would ordinarily use with languages like Java, C++, and C, you give one imperative command at a time, iterate with loops, and often mutate state shared between different functions or methods. Scala enables you to program imperatively, but as you get to know Scala better, you'll likely often find yourself programming in a more functional style. In fact, one of the main aims of the Scalazine will be to help you become as competent at functional programming as you are at imperative programming, using Scala as a vehicle.
One of the main characteristics of a functional language is that functions are first class constructs, and that's very true in Scala. For example, another (far more concise) way to print each command line argument is:
args.foreach(arg => println(arg))
In this code, you call the foreach method on args, and pass in a function. In this case, you're passing in an anonymous function (one with no name), which takes one parameter named arg. The code of the anonymous function is println(arg). If you type the above code into a new file named pa.scala, and execute with the command:
scala pa.scala Concise is nice
You should see:
Concise
is
nice
In the previous example, the Scala interpreter infers the type of arg to be String, since Strings are what the array on which you're calling foreach is holding. If you'd prefer to be more explicit, you can mention the type name, but when you do you'll need to wrap the argument portion in parentheses (which is the normal form of the syntax anyway). Try typing this into a file named epa.scala:
args.foreach((arg: String) => println(arg))
Running this script has the same behavior as the previous one. With the command:
scala epa.scala Explicit can be nice too
You'll get:
Explicit
can
be
nice
too
If instead of an explicit mood, you're in the mood for even more conciseness, you can take advantage of a special case in Scala. If an anonymous function consists of one method application that takes a single argument, you need not explicitly name and specify the argument. Thus, the following code also works:
args.foreach(println)
To summarize, the syntax for an anonymous function is a list of named parameters, in parentheses, a right arrow, and then the body of the function. This syntax is illustrated in Figure 1.
Figure 1. The syntax of a Scala anonymous function.
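The three forms of the function argument described in this step can be put side by side. This is only a sketch: the array words stands in for args, and the function value shout is a hypothetical name added to show that an anonymous function can also be stored in a val.

```scala
// Hypothetical stand-in for the script's args array.
val words = Array("Concise", "is", "nice")

words.foreach(arg => println(arg))           // parameter type inferred as String
words.foreach((arg: String) => println(arg)) // parameter type written explicitly
words.foreach(println)                       // shortest form: just the method name

// An anonymous function is a value, so it can be named and passed around.
// shout is a hypothetical name used for illustration.
val shout = (s: String) => s.toUpperCase + "!"
println(words.map(shout).mkString(" ")) // prints "CONCISE! IS! NICE!"
```

Each of the three foreach calls prints the same three lines.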
Now, by this point you may be wondering what happened to those trusty for loops you have been accustomed to using in imperative languages such as Java. In an effort to guide you in a functional direction, only a functional relative of the imperative for (called a for comprehension) is available in Scala. While you won't see their full power and expressiveness in this article, we will give you a glimpse. In a new file named forprintargs.scala, type the following:
for (arg <- args) println(arg)
The parentheses after the for in this for comprehension contain arg <- args. To the left of the <- symbol, which you can say as “in”, is a declaration of a new val (not a var) named arg. To the right of <- is the familiar args array. When this code executes, arg will be assigned to each element of the args array and the body of the for, println(arg), will be executed. Scala's for comprehensions can do much more than this, but this simple form is similar in functionality to Java 5's:
// ...
for (String arg : args) { // Remember, this is Java, not Scala
    System.out.println(arg);
}
// ...
or Ruby's
for arg in ARGV # Remember, this is Ruby, not Scala
  puts arg
end
When you run the forprintargs.scala script with the command:
scala forprintargs.scala for is functional
You should see:
for
is
functional
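To give a slightly wider glimpse of what for comprehensions can do, here is a sketch that goes a little beyond the text: it adds yield, which builds a new collection, and an if guard, which filters elements. The array words is a hypothetical stand-in for args.

```scala
// Hypothetical stand-in for the script's args array.
val words = Array("for", "is", "functional")

// The same shape as forprintargs.scala: run the body once per element.
for (arg <- words)
  println(arg)

// With yield, the for comprehension produces a new collection.
val lengths = for (arg <- words) yield arg.length
println(lengths.mkString(", ")) // prints "3, 2, 10"

// A guard keeps only the elements that satisfy a condition.
val longWords = for (arg <- words if arg.length > 2) yield arg
println(longWords.mkString(", ")) // prints "for, functional"
```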
Step 8. Parameterize Arrays with types
In addition to being functional, Scala is object-oriented. In Scala, as in Java, you define a blueprint for objects with classes. From a class blueprint, you can instantiate objects, or class instances, by using new. For example, the following Scala code instantiates a new String and prints it out:
val s = new String("Hello, world!")
println(s)
In the previous example, you parameterize the String instance with the initial value "Hello, world!". You can think of parameterization as meaning configuring an instance at the point in your program that you create that instance. You configure an instance with values by passing objects to a constructor of the instance in parentheses, just like you do when you create an instance in Java. If you place the previous code in a new file named paramwithvalues.scala and run it with scala paramwithvalues.scala, you'll see the familiar Hello, world! greeting printed out.
In addition to parameterizing instances with values at the point of instantiation, you can in Scala also parameterize them with types. This kind of parameterization is akin to specifying a type in angle brackets when instantiating a generic type in Java 5 and beyond. The main difference is that instead of the angle brackets used for this purpose in Java, in Scala you use square brackets. Here's an example:
val greetStrings = new Array[String](3)

greetStrings(0) = "Hello"
greetStrings(1) = ", "
greetStrings(2) = "world!\n"

for (i <- 0 to 2)
  print(greetStrings(i))
In this example, greetStrings is a value of type Array[String] (say this as, “an array of string”) that is initialized to length 3 by passing the value 3 to a constructor in parentheses in the first line of code. Type this code into a new file called paramwithtypes.scala and execute it with scala paramwithtypes.scala, and you'll see yet another Hello, world! greeting. Note that when you parameterize an instance with both a type and a value, the type comes first in its square brackets, followed by the value in parentheses.
Had you been in a more explicit mood, you could have specified the type of greetStrings explicitly like this:
val greetStrings: Array[String] = new Array[String](3)
// ...
Given Scala's type inference, this line of code is semantically equivalent to the actual first line of code in paramwithtypes.scala. But this form demonstrates that while the type parameterization portion (the type names in square brackets) forms part of the type of the instance, the value parameterization part (the values in parentheses) does not. The type of greetStrings is Array[String], not Array[String](3).
The next three lines of code in paramwithtypes.scala initialize each element of the greetStrings array:
// ...
greetStrings(0) = "Hello"
greetStrings(1) = ", "
greetStrings(2) = "world!\n"
// ...
As mentioned previously, arrays in Scala are accessed by placing the index inside parentheses, not square brackets as in Java. Thus the zeroth element of the array is greetStrings(0), not greetStrings[0] as in Java.
These three lines of code illustrate an important concept to understand about Scala concerning the meaning of val. When you define a variable with val, the variable can't be reassigned, but the object to which it refers could potentially still be mutated. So in this case, you couldn't reassign greetStrings to a different array; greetStrings will always point to the same Array[String] instance with which it was initialized. But you can change the elements of that Array[String] over time, so the array itself is mutable.
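A short sketch of that distinction between reassigning a val and mutating the object it refers to. The second name, sameArray, is a hypothetical addition used to show that both names refer to one and the same array.

```scala
val greetStrings = new Array[String](3)

greetStrings(0) = "Hello"
greetStrings(0) = "Goodbye" // allowed: this mutates the array, not the val

// greetStrings = new Array[String](5) // would not compile: greetStrings is a val

// sameArray is a hypothetical second name for the same Array instance.
val sameArray = greetStrings
sameArray(1) = ", "

println(greetStrings(0)) // prints "Goodbye"
println(greetStrings(1)) // prints ", " because both names share one array
println(greetStrings(2)) // prints "null": this element was never assigned
```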
The final two lines in paramwithtypes.scala contain a for comprehension that prints out each greetStrings array element in turn.
// ...
for (i <- 0 to 2)
  print(greetStrings(i))
The first line of code in this for comprehension illustrates another general rule of Scala: if a method takes only one parameter, you can call it without a dot or parentheses. to is actually a method that takes one Int argument. The code 0 to 2 is transformed into the method call 0.to(2). (This to method actually returns not an Array but a different kind of sequence, a Range in current Scala versions, that yields the values 0, 1, and 2.) Scala doesn't technically have operator overloading, because it doesn't actually have operators in the traditional sense. Characters such as +, -, *, and / have no special meaning in Scala, but they can be used in method names. Thus, the expression 1 + 2, which was the first Scala code you typed into the interpreter in Step 2, is essentially equivalent in meaning to 1.+(2), where + is the name of a method defined in class scala.Int.
Another important idea illustrated by this example will give you insight into why arrays are accessed with parentheses in Scala. Scala has fewer special cases than Java. Arrays are simply instances of classes like any other class in Scala. When you apply parentheses to a variable and pass in some arguments, Scala will transform that into an invocation of a method named apply. So greetStrings(i) gets transformed into greetStrings.apply(i). Thus accessing the element of an array in Scala is simply a method call like any other method call. What's more, the compiler will transform any application of parentheses with some arguments on any type into an apply method call, not just arrays. Of course it will compile only if that type actually defines an apply method. So it's not a special case; it's a general rule.
Similarly, when an assignment is made to a variable that is followed by some arguments in parentheses, the compiler will transform that into an invocation of an update method that takes two parameters. For example,
greetStrings(0) = "Hello"
will essentially be transformed into
greetStrings.update(0, "Hello")
Thus, the following Scala code is semantically equivalent to the code you typed into paramwithtypes.scala:
val greetStrings = new Array[String](3)

greetStrings.update(0, "Hello")
greetStrings.update(1, ", ")
greetStrings.update(2, "world!\n")

for (i <- 0.to(2))
  print(greetStrings.apply(i))
Scala achieves a conceptual simplicity by treating everything, from arrays to expressions, as objects with methods. You as the programmer don't have to remember lots of special cases, such as the differences in Java between primitive and their corresponding wrapper types, or between arrays and regular objects. However, it is significant to note that in Scala this uniformity does not usually come with a performance cost as it often has in other languages that have aimed to be pure in their object orientation. The Scala compiler uses Java arrays, primitive types, and native arithmetic where possible in the compiled code. Thus Scala really does give you the best of both worlds in this sense: the conceptual simplicity of a pure object-oriented language with the runtime performance characteristics of language that has special cases for performance reasons.
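Because apply and update are a general rule rather than an array special case, any class can opt in. The sketch below uses a made-up class named Scoreboard (not part of the original text) to show the parenthesized forms next to the explicit method calls they stand for.

```scala
// Scoreboard is a hypothetical class used only to illustrate apply and update.
class Scoreboard(size: Int) {
  private val scores = new Array[Int](size)

  // Called when you write board(i)
  def apply(i: Int): Int = scores(i)

  // Called when you write board(i) = value
  def update(i: Int, value: Int): Unit = { scores(i) = value }
}

val board = new Scoreboard(3)

board(0) = 10           // the compiler turns this into board.update(0, 10)
board.update(1, 20)     // the same kind of call, written out explicitly

println(board(0))       // board.apply(0), prints 10
println(board.apply(1)) // prints 20

// Operators are ordinary method calls as well.
println(1 + 2 == 1.+(2)) // prints true
```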
Step 9. Use Lists and Tuples
One of the big ideas of the functional style of programming is that methods should not have side effects. The only effect of a method should be to compute the value or values that are returned by the method. Some benefits gained when you take this approach are that methods become less entangled, and therefore more reliable and reusable. Another benefit of the functional style in a statically typed language is that everything that goes into and out of a method is checked by a type checker, so logic errors are more likely to manifest themselves as type errors. To apply this functional philosophy to the world of objects, you would make objects immutable. A simple example of an immutable object in Java is String. If you create a String with the value "Hello, ", it will keep that value for the rest of its lifetime. If you later call concat("world!") on that String, it will not add "world!" to itself. Instead, it will create and return a brand new String with the value "Hello, world!".
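The String behavior described above is easy to check in a few lines. This is a minimal sketch; the val names hello and full are chosen here for illustration.

```scala
val hello = "Hello, "

// concat does not change hello; it returns a brand new String.
val full = hello.concat("world!")

println(hello) // still prints "Hello, "
println(full)  // prints "Hello, world!"
```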
As you've seen, a Scala Array is a mutable sequence of objects that all share the same type. An Array[String] contains only Strings, for example. Although you can't change the length of an Array after it is instantiated, you can change its element values. Thus, Arrays are mutable objects. An immutable, and therefore more functional-oriented, sequence of objects that share the same type is Scala's List. As with Arrays, a List[String] contains only Strings. Scala's List, scala.List, differs from Java's java.util.List type in that Scala Lists are always immutable (whereas Java Lists can be mutable). But more importantly, Scala's List is designed to enable a functional style of programming. Creating a List is easy; you just say:
val oneTwoThree = List(1, 2, 3)
This establishes a new val named oneTwoThree, which is initialized with a new List[Int] with the integer element values 1, 2 and 3. (You don't need to say new List because “List” is defined as a factory method on the scala.List singleton object. More on Scala's singleton object construct in Step 11.) Because Lists are immutable, they behave a bit like Java Strings in that when you call a method on one that might seem by its name to imply the List will be mutated, it instead creates a new List with the new value and returns it. For example, List has a method named ::: that concatenates a passed List and the List on which ::: was invoked. Here's how you use it:
val oneTwo = List(1, 2)
val threeFour = List(3, 4)
val oneTwoThreeFour = oneTwo ::: threeFour
println(oneTwo + " and " + threeFour + " were not mutated.")
println("Thus, " + oneTwoThreeFour + " is a new List.")
Type this code into a new file called listcat.scala and execute it with scala listcat.scala, and you should see:
List(1, 2) and List(3, 4) were not mutated.
Thus, List(1, 2, 3, 4) is a new List.
Enough said.
Perhaps the most common operator you'll use with Lists is ::, which is pronounced “cons.” Cons prepends a new element to the beginning of an existing List, and returns the resulting List. For example, if you type the following code into a file named consit.scala:
val twoThree = List(2, 3)
val oneTwoThree = 1 :: twoThree
println(oneTwoThree)
And execute it with scala consit.scala, you should see:
List(1, 2, 3)
Given that a shorthand way to specify an empty List is Nil, one way to initialize new Lists is to string together elements with the cons operator, with Nil as the last element. For example, if you type the following code into a file named consinit.scala:
val oneTwoThree = 1 :: 2 :: 3 :: Nil
println(oneTwoThree)
And execute it with scala consinit.scala, you should again see:
List(1, 2, 3)
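The two operators from this step can be compared in one sketch. The val names zeroOneTwo, built, and viaMethods are made up for illustration, and the last line spells out as explicit method calls what the chained :: form means: operators ending in a colon are invoked on their right-hand operand.

```scala
val oneTwo = List(1, 2)
val threeFour = List(3, 4)

// ::: concatenates two Lists into a new List.
val oneTwoThreeFour = oneTwo ::: threeFour

// :: prepends a single element to an existing List.
val zeroOneTwo = 0 :: oneTwo

// Chained :: calls start from Nil and group from the right.
val built = 1 :: 2 :: 3 :: Nil
val viaMethods = Nil.::(3).::(2).::(1)

println(oneTwoThreeFour) // prints List(1, 2, 3, 4)
println(zeroOneTwo)      // prints List(0, 1, 2)
println(built)           // prints List(1, 2, 3)
println(viaMethods)      // prints List(1, 2, 3)
println(oneTwo)          // prints List(1, 2): the original is unchanged
```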
Scala's List is packed with useful methods, many of which are shown in Table 1.
What it Is | What it Does |
---|---|
List() | Creates an empty List |
Nil | Creates an empty List |
List("Cool", "tools", "rule") | Creates a new List[String] with the three values "Cool", "tools", and "rule" |
val thrill = "Will" :: "fill" :: "until" :: Nil | Creates a new List[String] with the three values "Will", "fill", and "until" |
thrill(2) | Returns the 2nd element (zero based) of the thrill List (returns "until") |
thrill.count(s => s.length == 4) | Counts the number of String elements in thrill that have length 4 (returns 2) |
thrill.drop(2) | Returns the thrill List without its first 2 elements (returns List("until")) |
thrill.dropRight(2) | Returns the thrill List without its rightmost 2 elements (returns List("Will")) |
thrill.exists(s => s == "until") | Determines whether a String element exists in thrill that has the value "until" (returns true) |
thrill.filter(s => s.length == 4) | Returns a List of all elements, in order, of the thrill List that have length 4 (returns List("Will", "fill")) |
thrill.forall(s => s.endsWith("l")) | Indicates whether all elements in the thrill List end with the letter "l" (returns true) |
thrill.foreach(s => print(s)) | Executes the print statement on each of the Strings in the thrill List (prints "Willfilluntil") |
thrill.foreach(print) | Same as the previous, but more concise (also prints "Willfilluntil") |
thrill.head | Returns the first element in the thrill List (returns "Will") |
thrill.init | Returns a List of all but the last element in the thrill List (returns List("Will", "fill")) |
thrill.isEmpty | Indicates whether the thrill List is empty (returns false) |
thrill.last | Returns the last element in the thrill List (returns "until") |
thrill.length | Returns the number of elements in the thrill List (returns 3) |
thrill.map(s => s + "y") | Returns a List resulting from adding a "y" to each String element in the thrill List (returns List("Willy", "filly", "untily")) |
thrill.remove(s => s.length == 4) | Returns a List of all elements, in order, of the thrill List except those that have length 4 (returns List("until")) |
thrill.reverse | Returns a List containing all elements of the thrill List in reverse order (returns List("until", "fill", "Will")) |
thrill.sort((s, t) => s.charAt(0).toLowerCase < t.charAt(0).toLowerCase) | Returns a List containing all elements of the thrill List in alphabetical order of the first character lowercased (returns List("fill", "until", "Will")) |
thrill.tail | Returns the thrill List minus its first element (returns List("fill", "until")) |
Table 1. Some List
methods and usages.
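The following script is a minimal sketch that exercises a few of the operations from Table 1 on the same thrill list. It uses filterNot and sortWith, the names these two operations have in Scala 2.8 and later; the result names are invented for this example.

```scala
val thrill = "Will" :: "fill" :: "until" :: Nil

// None of these calls modify thrill: each one returns a new value.
val fourLetterCount = thrill.count(s => s.length == 4)
val fourLetterWords = thrill.filter(s => s.length == 4)
val otherWords = thrill.filterNot(s => s.length == 4)
val withY = thrill.map(s => s + "y")
val sorted = thrill.sortWith((s, t) => s.charAt(0).toLower < t.charAt(0).toLower)

println(fourLetterCount) // 2
println(fourLetterWords) // List(Will, fill)
println(otherWords)      // List(until)
println(withY)           // List(Willy, filly, untily)
println(sorted)          // List(fill, until, Will)
println(thrill)          // List(Will, fill, until)
```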
Besides List
, one other ordered collection of object elements that's very useful in Scala is the tuple. Like List
s, tuples are immutable, but unlike List
s, tuples can contain different types of elements. Thus whereas a list might be a List[Int]
or a List[String]
, a tuple could contain both an Int
and a String
at the same time. Tuples are very useful, for example, if you need to return multiple objects from a method. Whereas in Java, you would often create a JavaBean-like class to hold the multiple return values, in Scala you can simply return a tuple. And it is simple: to instantiate a new tuple that holds some objects, just place the objects in parentheses, separated by commas. Once you have a tuple instantiated, you can access its elements individually with a dot, underscore, and the one-based index of the element. For example, type the following code into a file named luftballons.scala
:
val pair = (99, "Luftballons")
println(pair._1)
println(pair._2)
In the first line of this code, you create a new tuple that contains an Int
with the value 99 as its first element, and a String
with the value "Luftballons"
as its second element. Scala infers the type of the tuple to be Tuple2[Int, String]
, and gives that type to the variable pair
as well. In the second line, you access the _1
field, which will produce the first element, 99. The .
in the second line is the same dot you'd use to access a field or invoke a method. In this case you are accessing a field named _1
. If you run this script with scala luftballons.scala
, you'll see:
99
Luftballons
The actual type of a tuple depends upon the number of elements it contains and the types of those elements. Thus, the type of (99, "Luftballons")
is Tuple2[Int, String]
. The type of ('u', 'r', "the", 1, 4, "me")
is Tuple6[Char, Char, String, Int, Int, String]
.
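To illustrate the point about returning several values from one method, here is a hedged sketch. The method minAndMax and the names lo and hi are hypothetical, chosen only for this example.

```scala
// Hypothetical helper: returns the smallest and the largest element as one tuple.
def minAndMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)

val result = minAndMax(List(3, 99, 7))
println(result._1) // 3
println(result._2) // 99

// A tuple can also be unpacked into separate names in a single definition.
val (lo, hi) = minAndMax(List(3, 99, 7))
println(s"$lo to $hi") // 3 to 99
```

The return type (Int, Int) is shorthand for Tuple2[Int, Int].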
Step 10. Use Sets and Maps
Because Scala aims to help you take advantage of both functional and imperative styles, its collections libraries make a point to differentiate between mutable and immutable collection classes. For example, Array
s are always mutable, whereas List
s are always immutable. When it comes to Set
s and Map
s, Scala also provides mutable and immutable alternatives, but in a different way. For Set
s and Map
s, Scala models mutability in the class hierarchy.
For example, the Scala API contains a base trait for Set
s, where a trait is similar to a Java interface
. (You'll find out more about trait
s in Step 12.) Scala then provides two subtraits, one for mutable Set
s, and another for immutable Set
s. As you can see in Figure 2, these three traits all share the same simple name, Set
. Their fully qualified names differ, however, because they each reside in a different package. Concrete Set
classes in the Scala API, such as the HashSet
classes shown in Figure 2, extend either the mutable or immutable Set
trait. (Although in Java you implement interface
s, in Scala you “extend” traits.) Thus, if you want to use a HashSet
, you can choose between mutable and immutable varieties depending upon your needs.
Figure 2. Class hierarchy for Scala
Set
s.
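One way to see the hierarchy at work is a method that accepts the base trait and therefore takes either variety. This is only a sketch: the helper summarize and the value names are assumptions made for illustration.

```scala
import scala.collection.{immutable, mutable}

// Hypothetical helper: it only reads, so it accepts the base trait scala.collection.Set.
def summarize(s: scala.collection.Set[String]): String = {
  val hasLear = s.contains("Lear")
  s"${s.size} element(s), contains Lear: $hasLear"
}

val fixed: immutable.Set[String] = immutable.HashSet("Lear")
val growable: mutable.Set[String] = mutable.HashSet("Lear", "Boeing")

println(summarize(fixed))    // 1 element(s), contains Lear: true
println(summarize(growable)) // 2 element(s), contains Lear: true
```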
To try out Scala Set
s, type the following code into a file named jetset.scala
:
import scala.collection.mutable.HashSet

val jetSet = new HashSet[String]
jetSet += "Lear"
jetSet += ("Boeing", "Airbus")
println(jetSet.contains("Cessna"))
The first line of jetset.scala
imports the mutable HashSet
. As with Java, the import allows you to use the simple name of the class, HashSet
, in this source file. After a blank line, the third line initializes jetSet
with a new HashSet
that will contain only String
s. Note that just as with List
s and Array
s, when you create a Set
, you need to parameterize it with a type (in this case, String
), since every object in a Set
must share the same type. The subsequent two lines add three objects to the mutable Set
via the +=
method. As with most other symbols you've seen that look like operators in Scala, +=
is actually a method defined on class HashSet
. Had you wanted to, instead of writing jetSet += "Lear"
, you could have written jetSet.+=("Lear")
. Because the +=
method takes a variable number of arguments, you can pass one or more objects at a time to it. For example, jetSet += "Lear"
adds one String
to the HashSet
, but jetSet += ("Boeing", "Airbus")
adds two Strings
to the set. Finally, the last line prints out whether or not the Set
contains a particular String
. (As you'd expect, it prints false
.)
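For comparison, a short sketch of how adding an element differs between the two varieties. The names jets, props, and moreProps are invented here; the immutable case relies on the default Set, which is the immutable one.

```scala
import scala.collection.mutable

// Mutable: += changes the set in place, so a val reference is enough.
val jets = mutable.HashSet("Lear")
jets += "Boeing"      // operator form
jets.+=("Airbus")     // the same method, called explicitly

// Immutable: + returns a new set and leaves the original untouched.
val props = Set("Cessna")
val moreProps = props + "Piper"

println(jets.size)      // 3
println(props.size)     // 1
println(moreProps.size) // 2
```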
Another useful collection class in Scala is Map
s. As with Set
s, Scala provides mutable and immutable versions of Map
, using a class hierarchy. As you can see in Figure 3, the class hierarchy for Map
s looks a lot like the one for Set
s. There's a base Map
trait in package scala.collection
, and two subtrait Map
s: a mutable Map
in scala.collection.mutable
and an immutable one in scala.collection.immutable
.
Figure 3. Class hierarchy for Scala
Map
s.
Implementations of Map
, such as the HashMap
s shown in the class hierarchy in Figure 3, implement either the mutable or immutable trait. To see a Map
in action, type the following code into a file named treasure.scala
.
// In treasure.scala
import scala.collection.mutable.HashMap

val treasureMap = new HashMap[Int, String]
treasureMap += 1 -> "Go to island."
treasureMap += 2 -> "Find big X on ground."
treasureMap += 3 -> "Dig."
println(treasureMap(2))
On the first line of treasure.scala
, you import the mutable form of HashMap
. After a blank line, you define a val
named treasureMap
and initialize it with a new mutable HashMap
whose keys will be Int
s and values String
s. On the next three lines you add key/value pairs to the HashMap
using the ->
method. As illustrated in previous examples, the Scala compiler transforms a binary operation expression like 1 -> "Go to island."
into 1.->("Go to island.")
. Thus, when you say 1 -> "Go to island."
, you are actually calling a method named ->
on an Int
with the value 1, and passing in a String
with the value "Go to island."
This ->
method, which you can invoke on any object in a Scala program, returns a two-element tuple containing the key and value. You then pass this tuple to the +=
method of the HashMap
object to which treasureMap
refers. Finally, the last line prints the value that corresponds to the key 2
in the treasureMap
. If you run this code, it will print:
Find big X on ground.
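The tuple behind the arrow can be made visible with a small sketch like the following. The names step and asTuple are hypothetical; the three additions are meant to be equivalent ways of passing a key/value tuple to +=.

```scala
import scala.collection.mutable.HashMap

// -> simply builds a two-element tuple.
val step = 1 -> "Go to island."
val asTuple = (1, "Go to island.")
println(step == asTuple) // true

val treasureMap = new HashMap[Int, String]
treasureMap += step                          // a tuple built earlier
treasureMap.+=(2 -> "Find big X on ground.") // explicit method call
treasureMap += ((3, "Dig."))                 // a tuple literal works too
println(treasureMap(3)) // Dig.
```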
Because maps are such a useful programming construct, Scala provides a factory method for Map
s that is similar in spirit to the factory method shown in Step 9 that allows you to create List
s without using the new
keyword. To try out this more concise way of constructing maps, type the following code into a file called numerals.scala
:
// In numerals.scala
val romanNumeral = Map(1 -> "I", 2 -> "II", 3 -> "III", 4 -> "IV", 5 -> "V")
println(romanNumeral(4))
In numerals.scala
you take advantage of the fact that the immutable Map
trait is automatically imported into any Scala source file. Thus when you say Map
in the first line of code, the Scala interpreter knows you mean scala.collection.immutable.Map
. In this line, you call a factory method on the immutable Map
's companion object, passing in five key/value tuples as parameters. This factory method returns an instance of the immutable HashMap
containing the passed key/value pairs. The name of the factory method is actually apply
, but as mentioned in Step 8, if you say Map(...)
it will be transformed by the compiler to Map.apply(...)
. If you run the numerals.scala
script, it will print IV
.
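As a hedged sketch of the apply rewrite, the factory method can also be called by name. The get and getOrElse lookups are not used in the text above; they are included on the assumption that you may want to avoid the NoSuchElementException that romanNumeral(9) would throw for a missing key.

```scala
// Map(...) and Map.apply(...) are the same call.
val romanNumeral = Map.apply(1 -> "I", 2 -> "II", 3 -> "III")

println(romanNumeral(2))                // II
println(romanNumeral.get(9))            // None
println(romanNumeral.getOrElse(9, "?")) // ?
```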
Step 11. Understand classes and singleton objects
Up to this point you've written Scala scripts to try out the concepts presented in this article. For all but the simplest projects, however, you will likely want to partition your application code into classes. To give this a try, type the following code into a file called greetSimply.scala
:
// In greetSimply.scala
class SimpleGreeter {
  val greeting = "Hello, world!"
  def greet() = println(greeting)
}
val g = new SimpleGreeter
g.greet()
greetSimply.scala
is actually a Scala script, but one that contains a class definition. This first example, however, illustrates that as in Java, classes in Scala encapsulate fields and methods. Fields are defined with either val
or var
. Methods are defined with def
. For example, in class SimpleGreeter
, greeting
is a field and greet
is a method. To use the class, you initialize a val
named g
with a new instance of SimpleGreeter
. You then invoke the greet
instance method on g
. If you run this script with scala greetSimply.scala
, you will be dazzled with yet another Hello, world!
.
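The difference between the two kinds of field can be sketched with a class that has one of each. Counter and its members are hypothetical names used only for this illustration.

```scala
// Hypothetical class for illustration.
class Counter {
  val label = "clicks" // val: cannot be reassigned after construction
  var count = 0        // var: can be reassigned
  def increment() = { count += 1 }
}

val c = new Counter
c.increment()
c.increment()
println(c.label + ": " + c.count) // clicks: 2
```

Writing c.label = "taps" would be rejected by the compiler, because label is a val.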
Although classes in Scala are in many ways similar to Java, in several ways they are quite different. One difference between Java and Scala involves constructors. In Java, classes have constructors, which can take parameters, whereas in Scala, classes can take parameters directly. The Scala notation is more concise: class parameters can be used directly in the body of the class, so there is no need to define fields and write assignments that copy constructor parameters into fields. This can yield substantial savings in boilerplate code, especially for small classes. To see this in action, type the following code into a file named greetFancily.scala
:
// In greetFancily.scala
class FancyGreeter(greeting: String) {
  def greet() = println(greeting)
}
val g = new FancyGreeter("Salutations, world!")
g.greet()
Instead of defining a constructor that takes a String
, as you would do in Java, in greetFancily.scala
you placed the greeting
parameter of that constructor in parentheses placed directly after the name of the class itself, before the open curly brace of the body of class FancyGreeter
. When defined in this way, greeting
essentially becomes a value (not a variable—it can't be reassigned) field that's available anywhere inside the body. In fact, you pass it to println
in the body of the greet
method. If you run this script with the command scala greetFancily.scala
, it will inspire you with:
Salutations, world!
This is cool and concise, but what if you wanted to check the String
passed to FancyGreeter
's primary constructor for null
, and throw NullPointerException
to abort the construction of the new instance? Fortunately, you can. Any code sitting inside the curly braces surrounding the class definition, but which isn't part of a method definition, is compiled into the body of the primary constructor. In essence, the primary constructor will first initialize what is essentially a final field for each parameter in parentheses following the class name. It will then execute any top-level code contained in the class's body. For example, to check a passed parameter for null
, type in the following code into a file named greetCarefully.scala
:
// In greetCarefully.scala
class CarefulGreeter(greeting: String) {
  if (greeting == null) {
    throw new NullPointerException("greeting was null")
  }
  def greet() = println(greeting)
}
new CarefulGreeter(null)
In greetCarefully.scala
, an if
statement is sitting smack in the middle of the class body, something that wouldn't compile in Java. The Scala compiler places this if
statement into the body of the primary constructor, just after code that initializes what is essentially a final field named greeting
with the passed value. Thus, if you pass in null
to the primary constructor, as you do in the last line of the greetCarefully.scala
script, the primary constructor will first initialize the greeting
field to null
. Then, it will execute the if
statement that checks whether the greeting
field is equal to null
, and since it is, it will throw a NullPointerException
. If you run greetCarefully.scala
, you will see a NullPointerException
stack trace.
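If you would rather observe the constructor check without a stack trace, one option is to wrap the construction in scala.util.Try from the standard library. This is a sketch, not part of the original example: the class is repeated so the script is self-contained, and the names ok and bad are invented.

```scala
import scala.util.Try

class CarefulGreeter(greeting: String) {
  // Runs as part of the primary constructor.
  if (greeting == null) {
    throw new NullPointerException("greeting was null")
  }
  def greet() = println(greeting)
}

// Try captures the exception thrown during construction instead of aborting the script.
val ok = Try(new CarefulGreeter("Hello"))
val bad = Try(new CarefulGreeter(null))

println(ok.isSuccess)  // true
println(bad.isFailure) // true
```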
In Java, you sometimes give classes multiple constructors with overloaded parameter lists. You can do that in Scala as well; however, you must pick one of them to be the primary constructor, and place those constructor parameters directly after the class name. You then place any additional auxiliary constructors in the body of the class as methods named this
. To try this out, type the following code into a file named greetRepeatedly.scala
:
// In greetRepeatedly.scala
class RepeatGreeter(greeting: String, count: Int) {
  def this(greeting: String) = this(greeting, 1)
  def greet() = {
    for (i <- 1 to count)
      println(greeting)
  }
}
val g1 = new RepeatGreeter("Hello, world", 3)
g1.greet()
val g2 = new RepeatGreeter("Hi there!")
g2.greet()
RepeatGreeter
's primary constructor takes not only a String
greeting
parameter, but also an Int
count of the number of times to print the greeting. However, RepeatGreeter
also contains a definition of an auxiliary constructor, the this
method that takes a single String
greeting
parameter. The body of this constructor consists of a single statement: an invocation of the primary constructor parameterized with the passed greeting
and a count of 1. In the final four lines of the greetRepeatedly.scala
script, you create two RepeatGreeter
instances, one using each constructor, and call greet
on each. If you run greetRepeatedly.scala
, it will print:
Hello, world
Hello, world
Hello, world
Hi there!
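For a simple case like this one, a default parameter value (available since Scala 2.8) may be an alternative to the auxiliary constructor. The sketch below is a different technique from the one shown above, not a replacement for it; DefaultingGreeter and greetings are hypothetical names.

```scala
// Variant for illustration: the default value stands in for the auxiliary constructor.
class DefaultingGreeter(greeting: String, count: Int = 1) {
  // Builds the lines first so they can be inspected as well as printed.
  def greetings: List[String] = List.fill(count)(greeting)
  def greet() = greetings.foreach(println)
}

new DefaultingGreeter("Hello, world", 3).greet() // prints the greeting three times
new DefaultingGreeter("Hi there!").greet()       // prints the greeting once
```

Auxiliary constructors remain the tool to reach for when the alternative parameter lists differ in type rather than just in how many arguments are supplied.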
Another area in which Scala departs from Java is that you can't have any static fields or methods in a Scala class. Instead, Scala allows you to create singleton objects using the keyword object
. A singleton object cannot, and need not, be instantiated with new
. It is essentially automatically instantiated the first time it is used, and as the “singleton” in its name implies, there is only ever one instance. A singleton object can share the same name with a class, and when it does, the singleton is called the class's companion object. The Scala compiler transforms the fields and methods of a singleton object to static fields and methods of the resulting binary Java class. To give this a try, type the following code into a file named WorldlyGreeter.scala
:
// In WorldlyGreeter.scala
// The WorldlyGreeter class
class WorldlyGreeter(greeting: String) {
  def greet() = {
    val worldlyGreeting = WorldlyGreeter.worldify(greeting)
    println(worldlyGreeting)
  }
}
// The WorldlyGreeter companion object
object WorldlyGreeter {
  def worldify(s: String) = s + ", world!"
}
In this file, you define both a class, with the class
keyword, and a companion object, with the object
keyword. Both types are named WorldlyGreeter
. One way to think about this if you are coming from a Java programming perspective is that any static methods that you would have placed in class WorldlyGreeter
in Java, you'd put in singleton object WorldlyGreeter
in Scala. In fact, when the Scala compiler generates bytecodes for this file, it will create a Java class named WorldlyGreeter
that has an instance method named greet
(defined in the WorldlyGreeter
class in the Scala source) and a static method named worldify
(defined in the WorldlyGreeter
companion object in Scala source). Note also that in the first line of the greet
method in class WorldlyGreeter
, you invoke the singleton object's worldify
method using a syntax similar to the way you invoke static methods in Java: the singleton object name, a dot, and the method name:
// Invoking a method on a singleton object from class WorldlyGreeter
// ...
val worldlyGreeting = WorldlyGreeter.worldify(greeting)
// ...
To run this code, you'll need to create an application. Type the following code into a file named WorldlyApp.scala
:
// In WorldlyApp.scala
// A singleton object with a main method that allows
// this singleton object to be run as an application
object WorldlyApp {
  def main(args: Array[String]): Unit = {
    val wg = new WorldlyGreeter("Hello")
    wg.greet()
  }
}
Because there's no class named WorldlyApp
, this singleton object is not a companion object. It is instead called a stand-alone. object. Thus, a singleton object is either a companion or a stand-alone object. The distinction is important because companion objects get a few special privileges, such as access to private members of the like-named class.
One difference between Scala and Java is that whereas Java requires you to put a public class in a file named after the class—for example, you'd put class SpeedRacer
in file SpeedRacer.java
—in Scala, you can name .scala
files anything you want, no matter what Scala classes or code you put in them. In general in the case of non-scripts, however, it is recommended style to name files after the classes they contain as is done in Java, so that programmers can more easily locate classes by looking at file names. This is the approach we've taken with the two files in this example, WorldlyGreeter.scala
and WorldlyApp.scala
.
Neither WorldlyGreeter.scala
nor WorldlyApp.scala
is a script, because each ends in a definition. A script, by contrast, must end in a result expression. Thus if you try to run either of these files as a script, for example by typing:
scala WorldlyGreeter.scala # This won't work!
the Scala interpreter will complain that WorldlyGreeter.scala
does not end in a result expression. Instead, you'll need to actually compile these files with the Scala compiler, then run the resulting class files. One way to do this is to use scalac
, which is the basic Scala compiler. Simply type:
scalac WorldlyApp.scala WorldlyGreeter.scala
Given that the scalac
compiler starts up a new JVM instance each time it is invoked, and that the JVM often has a perceptible start-up delay, the Scala distribution also includes a Scala compiler daemon called fsc
(for fast Scala compiler). You use it like this:
fsc WorldlyApp.scala WorldlyGreeter.scala
The first time you run fsc
, it will create a local server daemon attached to a port on your computer. It will then send the list of files to compile to the daemon via the port, and the daemon will compile the files. The next time you run fsc
, the daemon will already be running, so fsc
will simply send the file list to the daemon, which will immediately compile the files. Using fsc
, you only need to wait for the JVM to start up the first time. If you ever want to stop the fsc
daemon, you can do so with fsc -shutdown
.
Running either of these scalac
or fsc
commands will produce Java class files that you can then run via the scala
command, the same command you used to invoke the interpreter in previous examples. However, instead of giving it a filename with a .scala
extension containing Scala code to interpret as you did in every previous example, in this case you'll give it the name of a class containing a main
method. Similar to Java, any Scala class with a main
method that takes a single parameter of type Array[String]
and returns Unit
can serve as the entry point to an application. In this example, WorldlyApp
has a main
method with the proper signature, so you can run this example by typing:
scala WorldlyApp
At which point you should see:
Hello, world!
You may recall seeing this output previously, but this time it was generated in this interesting manner:
- The scala program fires up a JVM with WorldlyApp's main method as the entry point.
- WorldlyApp's main method creates a new WorldlyGreeter instance via new, passing in the string "Hello" as a parameter.
- Class WorldlyGreeter's primary constructor essentially initializes a final field named greeting with the passed value, "Hello" (this initialization code is automatically generated by the Scala compiler).
- WorldlyApp's main method initializes a local val named wg with the new WorldlyGreeter instance.
- WorldlyApp's main method then invokes greet on the WorldlyGreeter instance to which wg refers.
- Class WorldlyGreeter's greet method invokes worldify on singleton object WorldlyGreeter, passing along the value of the final field greeting, "Hello".
- Companion object WorldlyGreeter's worldify method returns a String consisting of the concatenation of the s parameter, which is "Hello", and the literal String ", world!".
- Class WorldlyGreeter's greet method then initializes a val named worldlyGreeting with the "Hello, world!" String returned from the worldify method.
- Class WorldlyGreeter's greet method passes the "Hello, world!" String to which worldlyGreeting refers to println, which sends the cheerful greeting, via the standard output stream, to you.
Step 12. Understand traits and mixins
As first mentioned in Step 10, Scala includes a construct called a trait, which is similar in spirit to Java's interface
. One main difference between Java interface
s and Scala's traits is that whereas all methods in Java interface
s were by definition abstract before Java 8 added default methods, you can give methods real bodies with real code in Scala traits. Here's an example:
trait Friendly {
  def greet() = "Hi"
}
In this example, the greet
method returns the String
"Hi"
. If you are coming from Java, this greet
method may look a little funny to you, as if greet()
is somehow a field being initialized to the String
value "Hi"
. What is actually going on is that lacking an explicit return
statement, Scala methods will return the value of the last expression. In this case, the value of the last expression is "Hi"
, so that is returned. A more verbose way to say the same thing would be:
trait Friendly {
  def greet(): String = {
    return "Hi"
  }
}
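The last-expression rule applies to longer method bodies too. In the following sketch, parityOf is a hypothetical method invented for this illustration.

```scala
// Hypothetical method: there is no return keyword anywhere in the body.
def parityOf(n: Int): String = {
  val parity = if (n % 2 == 0) "even" else "odd" // if/else is itself an expression
  n.toString + " is " + parity                   // last expression, so this is the result
}

println(parityOf(4)) // 4 is even
println(parityOf(7)) // 7 is odd
```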
Regardless of how you write the methods, however, the key point is that Scala traits can actually contain non-abstract methods. Another difference between Java interface
s and Scala traits is that whereas you implement Java interfaces, you extend Scala traits. Other than this implements
/extends
difference, however, inheritance when you are defining a new type works in Scala similarly to Java. In both Java and Scala, a class can extend one (and only one) other class. In Java, an interface
can extend zero to many interface
s. Similarly in Scala, a trait can extend zero to many traits. In Java, a class can implement zero to many interface
s. Similarly in Scala, a class can extend zero to many traits. implements
is not a keyword in Scala.
Here's an example:
class Dog extends Friendly {
  override def greet() = "Woof"
}
In this example, class Dog
extends trait Friendly
. This inheritance relationship implies much the same thing as interface
implementation does in Java. You can assign a Dog
instance to a variable of type Friendly
. For example:
var pet: Friendly = new Dog
println(pet.greet())
When you invoke the greet
method on the Friendly
pet
variable, it will use dynamic binding, as in Java, to determine which implementation of the method to call. In this case, class Dog
overrides the greet
method, so Dog
's implementation of greet
will be invoked. Were you to execute the above code, you would get Woof
(Dog
's implementation of greet
), not Hi
(Friendly
's implementation of greet
). Note that one difference with Java is that to override a method in Scala, you must precede the method's def
with override
. If you attempt to override a method without specifying override
, your Scala code won't compile.
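The inheritance rules described above (one superclass, any number of traits, and override for replacing a concrete method) can be sketched in a single script. Named, Polite, Animal, and Cat are hypothetical names used only here; Friendly is repeated so the sketch is self-contained.

```scala
trait Friendly { def greet() = "Hi" }
trait Named { def name: String }         // hypothetical trait with an abstract member
trait Polite extends Friendly with Named // a trait may extend several traits

class Animal                             // hypothetical base class

// Exactly one superclass, followed by any number of traits introduced by `with`.
class Cat extends Animal with Polite {
  def name = "Tom"              // implements the abstract member: no override needed
  override def greet() = "Meow" // replaces a concrete method: override is required
}

val cat = new Cat
val asFriendly: Friendly = cat
println(asFriendly.greet()) // Meow
println(cat.name)           // Tom
```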
Finally, one quite significant difference between Java's interface
s and Scala's traits is that in Scala, you can mix in traits at instantiation time. For example, consider the following trait:
trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}
Trait ExclamatoryGreeter
extends trait Friendly
and overrides the greet
method. ExclamatoryGreeter
's greet
method first invokes the superclass's greet
method, appends an exclamation point to whatever the superclass’s greet
method returns, and returns the resulting String
. With this trait, you can mix in its behavior at instantiation time using the with
keyword. Here's an example:
val pup: Friendly = new Dog with ExclamatoryGreeter
println(pup.greet())
Given the initial line of code, the Scala compiler will create a synthetic type that extends class Dog
and trait ExclamatoryGreeter
and instantiate it. When you invoke a method on the synthetic type, it will cause the correct implementation to be invoked. When you run this code, the pup
variable will first be initialized with the new instance of the synthetic type, then when greet
is invoked on pup
, you'll see "Woof!"
. Note that had pup
not been explicitly defined to be of type Friendly
, the Scala compiler would have inferred the type of pup
to be Dog with ExclamatoryGreeter
.
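More than one trait can be mixed in at instantiation time, and the order then matters: each super.greet() call goes to the trait mixed in just before it. The sketch below assumes a second, hypothetical trait named QuestioningGreeter, which is not part of the original example.

```scala
trait Friendly { def greet() = "Hi" }
class Dog extends Friendly { override def greet() = "Woof" }
trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}
// Hypothetical second mixin, for illustration only.
trait QuestioningGreeter extends Friendly {
  override def greet() = super.greet() + "?"
}

// The trait mixed in last runs first and delegates leftward through super.
val a = new Dog with ExclamatoryGreeter with QuestioningGreeter
val b = new Dog with QuestioningGreeter with ExclamatoryGreeter

println(a.greet()) // Woof!?
println(b.greet()) // Woof?!
```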
To give all these concepts a try, type the following code into a file named friendly.scala
:
trait Friendly {
  def greet() = "Hi"
}

class Dog extends Friendly {
  override def greet() = "Woof"
}

class HungryCat extends Friendly {
  override def greet() = "Meow"
}

class HungryDog extends Dog {
  override def greet() = "I'd like to eat my own dog food"
}

trait ExclamatoryGreeter extends Friendly {
  override def greet() = super.greet() + "!"
}

var pet: Friendly = new Dog
println(pet.greet())

pet = new HungryCat
println(pet.greet())

pet = new HungryDog
println(pet.greet())

pet = new Dog with ExclamatoryGreeter
println(pet.greet())

pet = new HungryCat with ExclamatoryGreeter
println(pet.greet())

pet = new HungryDog with ExclamatoryGreeter
println(pet.greet())
When you run the friendly.scala
script, it will print:
Woof
Meow
I'd like to eat my own dog food
Woof!
Meow!
I'd like to eat my own dog food!
Conclusion
As you may have glimpsed by reading this article, the promise of Scala is that you can get more productivity while leveraging existing investments. Scala's basic conciseness of syntax and support for the functional programming style promise increased programmer productivity compared to the Java language, while enabling you to continue to take advantage of all the great things about the Java platform. You can complement Java code with Scala code and continue to leverage your existing Java code and APIs, the many APIs available for the Java platform, the runtime performance offered by JVMs, and your own knowledge of the Java platform.
With the knowledge you've gained in this article, you should already be able to get started using Scala for small tasks, especially scripts. In future articles, we will dive into more detail in these topics, and introduce other topics that weren't even hinted at here.