Lazy Evaluations

There are two methods to evaluate an expression.

  1. Strict
  2. Lazy

The strict means, evaluate the expression now and the lazy means, evaluate it for the first use. Strict evaluation is the default behavior in most of the programming languages. However, lazy evaluation is an important idea in the functional programming world. So, a functional programming language, such as Scala, also offers the lazy evaluation as an alternative. Let me start with the Strict evaluation and show you some examples, and then we will look at the lazy evaluation.

The Strict evaluation

I will try to explain the strict behavior in three different scenarios.

  1. Strict Variable assignment
  2. Strict Function parameters
  3. Strict Function values in Higher Order functions

We will need a function to help us to explore the behavior in various scenarios. I will use the below function for that purpose.

                                        
    def factorial(i:Int):Int = {
        println(s"Starting Factorial for $i")
        def  tFactorial(n: Int, f:Int): Int = {       
                if (n <= 0)  f
                else  tFactorial(n-1, n*f)      
            }    
        return  tFactorial(i,1)
    }                
                                    

The above function is a simple tail recursive factorial. It calculates the factorial and also prints a message whenever we evaluate this function. Now, let's start with the three scenarios that I listed earlier.


Strict variable assignment

                                        
    val  s =  factorial(15)/factorial(11)
    /*Output:- 
    Starting Factorial for 15
    Starting Factorial for 11
    s: Int = 50
    */
    println(s)
    //50                 
                                    

The above Scala statement initializes a value s with the result of an expression. Like most of the languages, Scala is a strict language. So the value for s is evaluated immediately. When we use the s later in a println statement, Scala does not evaluate it again. This behavior is logical, and we have been using it for a long time in every other programming language.

Strict Function parameters

Let's try to pass the same factorial expression to a function.

                                        
    //define a new function
    def twice(i:Int ) = {
    println("We have not used i yet")
    i + i
    }
    //call the function
    twice(factorial(15)/factorial(11) )
    /*Output:- 
    Starting Factorial for 15
    Starting Factorial for 11
    We have not used i yet
    res7: Int = 100
    */                 
                                    

In the above example, we are passing an expression to the function. Scala evaluates the expression before it can pass the result to the function parameter. So, the parameter i is initialized in the beginning even before we used it, and Scala never re-evaluate the i inside the function body. There is nothing new so far, and we are already familiar with the above two scenarios.


Strict function values in Higher Order functions

Higher order functions are part of the functional paradigm. They are not available in imperative languages. So it is worth to investigate the behavior of expression evaluation in case of a higher order function. The example shown below passes the same expression to a higher order function.

                                        
    //define a HO function
    def twice(f: => Int ) =  {
    println("We have not used f yet")
    f + f
    }
    //call the function
    twice(factorial(15)/factorial(11))
    /* Output:-
    We have not used f yet
    Starting Factorial for 15
    Starting Factorial for 11
    Starting Factorial for 15
    Starting Factorial for 11
    res8: Int = 100
    */                 
                                    

Look at the output of the previous example and compare it with the above output. The previous example does following.

  1. Evaluate the factorial for 15
  2. Evaluate the factorial for 11
  3. Evaluate the expression factorial(15)/factorial(11)
  4. Assign the results to function parameter i.
  5. Execute the function body and use the evaluated value of i.

However, the Higer Order function example does following.

  1. Pass the whole expression ( factorial(15)/factorial(11) ) as a function without evaluating it.
  2. Execute the body of the function twice.
  3. Evaluate f two times as we used it two times in the body.

So, In case of a function parameter, Scala defers the evaluation for later, and it evaluates the function on every use. If we use f ten times in our function, Scala will evaluate it ten times. You may argue that such behavior is necessary for a parameter that is not a simple value but a function value. I also agree with that. However, if the parameter function is pure and it always returns the same value, this repeated execution is unnecessary and may cause a performance problem.
How can we avoid this thing?
Cache the parameters in a val. The below example shows an updated version.


                                        
    //define a function
    def twice(f: => Int ) =  {
        val i = f
        println("We have not used i yet")
        i + i
    } 
    //Call the function
    twice(factorial(15)/factorial(11))
    /*Output: - 
    Starting Factorial for 15
    Starting Factorial for 11
    We have not used i yet
    res9: Int = 100
    */                    
                                    

A simple caching can help us to avoid repeated evaluation. However, we can use it only if the function is pure. You must have noticed that f also goes through a strict evaluation. Scala evaluates it immediately as soon as we refer it. However, there is a difference. Unlike normal value parameters, Scala does not evaluate a function value parameter before passing it as an argument.

The Lazy evaluation

The strict evaluation is the default behavior in almost every programming language. However, sometimes the strictness is not desired. In some cases, we do not want that to happen. That is where the Lazy evaluations play a role. Lazy evaluations are one of the Functional Programming techniques for implementing the efficient code. So, almost every Functional Programming language supports the lazy evaluation. Scala also offers you the same. Let's try to understand the lazy behavior in the same three different scenarios.

  1. Lazy variable assignment
  2. Lazy Function parameters
  3. Lazy Function values in Higher Order functions

Let's start with the lazy variable assignment.


Lazy variable assignment

                                        
    lazy val  l =  factorial(15)/factorial(11)
    //Output:- l: Int = <lazy>
    //now use the value l
    println(l)
    /* Output:-
    Starting Factorial for 15
    Starting Factorial for 11
    50
    */                    
                                    

We used the same expression earlier without the keyword lazy. However, this time, Scala does not evaluate the value of l because we made it lazy by using the lazy keyword. Scala will not evaluate the expression until we use the l. So, when we print the l, it evaluates the expression.
So, what is a lazy evaluation?
Evaluate the values at the time of its first use.

Lazy function parameters

There is no way to make a parameter value lazy because it does not make any sense. If you want to pass an expression to a function as a parameter and hold the evaluation of the expression for later, use a higher order function. That is what a higher order function does by default.


Lazy function values in Higher Order functions

We created one Higer Order function example earlier and cached the function value. However, we learned that Scala evaluates the function immediately as we cache it. In that scenario, even if we do not use the cached value, the function gets evaluated at the time of caching.
The following example uses a lazy val to cache the function value. This simple technique makes the function value lazy, and instead of immediate evaluation, the function value is evaluated once at the time of its first use.

                                        
    //define the function
    def twice(f: => Int ) =  {
    lazy val i = f
    println("We did not use i yet")
    i + i
    }
    //call the function
    twice(factorial(15)/factorial(11))
    /* Output:-
    We did not use i yet
    Starting Factorial for 15
    Starting Factorial for 11
    res1: Int = 100
    */                    
                                    

Benefits of lazy evaluation

Are you wondering about the benefits? Why would you want to use the lazy evaluations? The laziness is mostly used to create data structures to handle large data volumes efficiently. Apache Spark libraries are the most common examples. If you know Spark RDD, they implement all transformations as lazy operations. Let's take an example to understand the benefits. I have a log file. There are different types of entries. Every line in the file has a record type of notice, error or info. We want to extract first two error messages.
Here is a simple Scala code to do that.

                                        
    import scala.io._
    val s = Source.fromFile("error_log").getLines().toList.filter(_.contains("[error]")).take(2)
    s foreach println                    
                                    

The above code gives me what I was looking for. However, did you notice that how this code works? We open the file, read all the lines and convert it to a list. Then we filter for all the lines that contain error. Finally, we take only first two of those lines. Assuming that my file is large, maybe few GBs, Is this an efficient method? I mean, I can create a simple loop. Read the file line by line until I get two error lines, and stop as soon as I get two error messages. That could be more efficient. Isn't it. I may get first two error messages at the beginning and avoid time and memory to process rest of the data. However, the example shown above is reading all the lines. It happens because the List is a strict data structure. When we call toList function, It evaluates immediately. So, we get all the lines from the file even before we apply a filter to it. Scala gives you a lazy alternative to lists. They call it Stream.
Here is an example that uses a stream instead of a list.

                                        
    val s = Source.fromFile("error_log").getLines().toStream.filter(_.contains("[error]")).take(2)
    s foreach  println                    
                                    

You might not notice the difference by simply using the above code. Let's execute the code in small parts and try to understand the difference.

                                        
    Source.fromFile("error_log").getLines().toStream
    /*
    res2: scala.collection.immutable.Stream[String] 
    = Stream([Sun Mar  7 16:02:00 2004] [notice] Apache/1.3.29 (Unix) configured -- resuming normal operations, ?)
    */                
                                    

The above code creates a stream. It evaluates just one element. The next item appears as ?. The ? indicates that the stream has not yet evaluated the next element because the stream is a lazy data structure and toStream is a lazy function. Scala will access the next line when we use the next line.

                                            
    Source.fromFile("error_log").getLines().toStream.filter(_.contains("[error]"))
    /*
    res3: scala.collection.immutable.Stream[String]
     = Stream([Sun Mar  7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome, ?)
    */                    
                                        

The next operation is the filter. When we apply a filter, the stream will start reading next line until we get the first line with an error message. However, the stream stops there. You can see the question mark once again.

                                                
    Source.fromFile("error_log").getLines().toStream.filter(_.contains("[error]")).take(2)
    /*
    res4: scala.collection.immutable.Stream[String] 
    = Stream([Sun Mar  7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome, ?)
    */                
                                            

When we say, take two, it still does not perform any action. It knows that it should pick just two such lines, but it does not evaluate until we use both the lines. When we print both the lines, it will evaluate both the lines and complete the stream object.


                                                    
    s foreach  println
    /*
    [Sun Mar  7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome
    [Mon Mar  8 07:27:36 2004] [error] [client 61.9.4.61] File does not exist: /usr/local/apache/htdocs/_vti_bin/owssvr.dll       
    */
                                                

The lazy behavior shown above is incredibly powerful. It not only delays the evaluation of individual operations but also combines them. You must have noticed the combining behavior in this example. The Stream reads one line. Then it applies the filter on that particular line. If the line contains error, it takes the line and moves to the next line. This behavior is opposite to the way list works. The list gets all the lines. Then it applies the filter to all the lines. The list is evaluating one operation at a time for all the lines. However, the Stream combines multiple operations and implement all of them together on each line. This kind of approach gives an advantage when we are dealing with large volumes of data and applying a series of operations.
In fact, it is so powerful that it allows us to create a dataset of infinite length. We evaluate what we need, but the data set can be of an unlimited range.
Let me show you a simple example.
I hope you are familiar with Fibonacci series. We need to initialize the Fibonacci series with first two numbers, and then it can go on to infinity. Let's create a function in Scala that can generate a list of all Fibonacci numbers.
Can we do this?
If we use a strict evaluation, the function will run out of memory for sure. The reason is straight. We cannot create a list of infinite elements. We do not have a computer capacity for that. However, we can do it using a lazy list because a lazy list will not evaluate all of the elements.
Here is the code.

                                                            
    def fibFrom(a: Int, b: Int): Stream[Int] = a #:: fibFrom(b, a + b)
    //You can create a series by calling a function.
    val f = fibFrom(1,2)
    //evaluate only as many as you want.
    f.takeWhile(_<10) foreach println       
                                                        

The function takes two parameters, and they are the first two numbers of the series. The function returns a stream. The body of the function is simple. The symbol # is used to create a stream. We place # symbol between two elements of the stream. So a is the first element, and we get next element using the recursion. This recursion goes on forever because we do not have a termination condition. The laziness is one of the powerful features of Functional Programming languages like Scala.
Keep reading for more interesting functional concepts.

Read More

Pure Functions | Referential Transparency | Benefits of pure functions | First class functions | Higher order function | Anonymous functions | Immutability | Tail Recursion | Expressions in Scala | Lazy Evaluations | Pattern Matching | Closures

By Prashant Pandey -


You will also like:


Scala named arguments

Learn about named arguments and default values in Scala functions with examples.

Learning Journal

First Class Functions

Function is a first-class citizen in functional programming. What does it mean?

Learning Journal

Pure Function benefits

Pure Functions are used heavily in functional programming. Learn Why?

Learning Journal

Free virtual machines

Get upto six free VMs in Google Cloud and learn Bigdata.

Learning Journal

Referential Transparency

Referential Transparency is an easy method to verify the purity of a function.

Learning Journal