Scala Foundation Course - Functional Elements Part 1


Scala is a hybrid language. It supports both the object-oriented and the functional programming paradigms. But a data engineer or a data scientist is more inclined towards using Scala as a functional programming language, and hence this tutorial leans towards functional programming using Scala. So, before we start learning the Scala language elements, it is a good idea to cover the elements of functional programming. In this lesson, I will talk about those elements and try to build a high-level picture. Let's start.

Who coined the term FP?

John Backus. He was the first person to use the term FP, in 1977, in his ACM A.M. Turing Award lecture.

What is functional programming?

There are disagreements about the answer to this question. But in its simplest form, here is the definition.

Functional programming is a way of writing software applications using only pure functions and immutable values.

What does it mean?
I will explain it, but it may take some time and practice to understand the real meaning and implications of this definition. The definition is reasonably good. However, pure functions and immutable values are just two elements of the functional paradigm. There are many others, and here is a list of the top 10 FP concepts.

  1. Pure functions and side effects
  2. Referential transparency
  3. First class functions & higher order functions
  4. Anonymous functions or lambda
  5. Immutability
  6. Recursion & tail recursion
  7. Statements
  8. Strict and non-strict (lazy) evaluation
  9. Pattern matching
  10. Closures

This list covers the most frequently referenced elements. A high-level understanding of all these items is essential to reason about the constructs of a functional programming language. I mean, when you learn the various Scala features, you should be able to relate them to, and justify them by, the elements listed here. This list also answers another question.


Why functional programming?

If you understand the benefits that those ten items bring to the table, you know the answer to this question. The greatest benefit of FP is that it brings a different approach and a different set of facilities for solving your problems. By the end of this section, you will learn many of those facilities and the overall approach. When you start practicing FP, you will realize that the approach fits better in many scenarios. Library design and data crunching are among those problems. That's one of the main reasons the Spark creators picked Scala to implement the Spark libraries.
So, with the stage set, let's take these top ten FP elements one by one.

What is a pure function?

Let me ask you a simple question. What is a function? Let's take the definition of a function from mathematics, because mathematical functions are always pure.
A function relates an input to an output. So, there are three main parts.

  1. The input
  2. The relationship
  3. Output

Here is an example.
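The original snippet is not reproduced in this text. A minimal sketch, assuming the example is the standard Math.sqrt call applied to 4, which is what the discussion that follows refers to (the name root is only for illustration):

```scala
// Input: 4. Relationship: the body of sqrt. Output: 2.0 (a Double).
val root = Math.sqrt(4)
```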

The input is 4, and the output is 2. Where is the relationship? The actual code of the sqrt function is the relationship, and that's the critical part. It is the relationship that determines the output based on the input. There are a few observations about this mathematical function.

  1. The Input solely determines the output -
    I mean, nothing else, such as a global variable, the content of a file, or input from the console, determines the output. It's only the input parameter value, nothing else. No matter how many times or where you invoke this function, as long as the input parameter value is the same, you are going to get the same output. That's the first feature of a pure function.
  2. The function doesn't change its input -
    In some of the programming languages, you might have heard about the call by value or call by reference. All that confusion is out of the scope from a pure function's standpoint because it guarantees that the input value remains unchanged. It never modifies the Input value. That's the second quality of a pure function.
  3. The function doesn't do anything else except computing the output -
    I mean, it doesn't read anything from a file or the console. It doesn't print anything on the console or write any data to a file. It doesn't read or modify a global variable or, for that matter, anything outside the function. In fact, it doesn't perform any I/O. A pure function is like a special-purpose machine. It takes the input, computes the output, returns it, and that's all. No other work. If it does anything else that impacts the outside world or is visible to the outside world, we call it a side effect of the function. A side effect is like doing something other than your primary purpose. So, a function is pure if it is free from side effects. That's the third quality.

That's all. If a function qualifies on these three conditions, it is a pure function.
Is there an easy method to validate the purity of a function?
Yes. Test it for the referential transparency.

What is referential transparency?

An expression, such as a function call, is said to be referentially transparent if we can replace it with its corresponding value without changing the program's behaviour.
So, can you replace all references to Math.sqrt(4) in your code with its corresponding value, 2?
The answer is obviously yes. If we know that the input value is 4, we can replace Math.sqrt(4) with 2.
Will it change the behaviour of your program? The answer is obviously no.
So, sqrt is a pure function because we can replace any call to it with its output value as long as the input value is the same.
Let me give you another example. I haven't covered anything about Scala yet, but don't get caught up in the syntax. We will learn the syntactical part of Scala as we progress. For now, just focus on the concept.
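The snippet itself is missing from this text, so the following is a sketch reconstructed from the discussion that follows. The names rt and g come from that discussion; the starting value 10 for g and the body of rt are assumptions chosen so that the two calls produce 15 and 20.

```scala
var g = 10                 // global mutable state (initial value is an assumption)

def rt(i: Int): Int = {
  g = i + g                // side effect: modifies a variable outside the function
  g                        // the result depends on g, not only on i
}

val v1 = rt(5)             // 15
val v2 = rt(5)             // 20: same argument, different result
```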


The first rt(5) returns 15 and the second rt(5) returns 20. Can I replace all references to rt(5) with 15 or 20? No. Right?
So, rt(i:Int):Int is not referentially transparent. It's not a pure function either, for two reasons.

  1. The rt is dependent upon global variable g. So, the output is not solely determined by the input parameter.
  2. It modifies an external variable, so it has a side effect.

So, the referential transparency test fails for this function, and it's not a pure function.

Pure function summary

A pure function follows these rules.

  1. The output is only dependent on input parameter values.
  2. The function doesn't modify the Input parameter values.
  3. The function doesn't have a side effect.

There are a few other things that you should note.
A function is referentially transparent if evaluating it always gives the same value for the same arguments and does nothing else. You can test the purity of a function using referential transparency.
Now, let's come to the benefits.

Why pure functions?

  1. Pure functions encourage safe ways of programming –
    I think you already agree with this statement because side effects are surprising for everyone except the original coder. Pure functions are small, precise, simple, safe and easy to reuse because you know that they take input and give output based on the input values, that's it. They don't do anything else. They don't surprise you.
  2. Pure functions are more composable or modular –
    It is very common in FP to combine many functions into a simple solution. For example, you’ll often see FP code written as a chain of function calls, like this:
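The original chain is not shown in this text. As an illustration only (the word list and the steps are made up for this sketch), here is a chain where each call is a pure function and the output of one step feeds the next:

```scala
val words = List("spark", "scala", "kafka", "hive")

// filter keeps words starting with "s", map upper-cases them, sorted orders them.
// Each call returns a new list; the original list is never modified.
val result = words.filter(w => w.startsWith("s")).map(w => w.toUpperCase).sorted
// result: List("SCALA", "SPARK")
```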

You can do this with non-pure functions as well, but it is easier with pure functions because they don't have side effects and their output depends only on the input values. We refer to this capability as function composition, and you can compose more confidently if you know that there are no side effects.

  3. Pure functions are easy to test –
    I think you will agree with this as well. Since there are no side effects and the output depends only on the input, your test cases are straightforward. You just pass a known value and assert the expected value, whereas simulating side effects is a real challenge for testing. For example, suppose you have two functions. The first function is not pure: it prints 'Hello World' on the console. The second function is pure: it doesn't print anything on the console but returns 'Hello World'. You can quickly assert the return value, whereas asserting the console output is a more complex thing.
  4. Pure functions are memoizable –
    Memoization is nothing but caching the results of deterministic functions. If you know that your function is pure, and you will need the results again, you can cache the output, or your compiler can do that as an optimization. But if the function has side effects, you can't safely replace a call with a cached result, because the side effect would be skipped.
  5. Pure functions can be lazy –
    This one is a big advantage, and we have listed it as a separate topic in our top 10 list. I will cover it later when we reach laziness.
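To make the testing and memoization points concrete, here is a minimal sketch. All names in it (printHello, hello, memoize, square, fastSquare) are hypothetical, and memoize is just one simple way to cache results, not a standard library function.

```scala
import scala.collection.mutable

// Impure: writes to the console, so a test would have to capture the output.
def printHello(): Unit = println("Hello World")

// Pure: returns the value, so a test is a single comparison on the result.
def hello(): String = "Hello World"

// Memoization: wrap a pure function and cache its results by argument.
// The cache is private to the wrapper, so callers see the same results as before.
def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.Map.empty[A, B]
  a => cache.getOrElseUpdate(a, f(a))
}

val square = (i: Int) => i * i
val fastSquare = memoize(square)
val first = fastSquare(12)    // computed: 144
val second = fastSquare(12)   // served from the cache: 144
```

Caching is safe here only because square is pure. Wrapping printHello the same way would silently skip the printing on the second call.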

Great. Let's move on to the next item.


What is a first-class Function?

If you can treat a function as a value, it is a first-class function.
That's a simple definition but what does it mean?
That means you should be able to do everything with the function that you can do with a value.

  1. You can assign it to a variable - You can assign a value to a variable, so you should be able to assign a function to a variable.
  2. You can pass it as an argument to other functions - You can pass a value to a function as an argument, so you should be able to pass a function as well.
  3. You can return it as a value from other functions - You can return a value from a function, so you should be able to return a function also.

If you can do those three things with a function, it's a first-class function. In Scala, all functions are first-class functions by default. That's a feature of Scala. You don't need to test whether your function is first-class or not. We will see some examples, but before that, let me cover the next term.

What is a Higher Order Function?

A function that does at least one of the following is a Higher Order Function.

  1. Takes one or more functions as arguments.
  2. Returns a function as its result.

So, let's assume I have a function F1. By default, F1 is a first-class function, so I should be able to do those three things with F1. Right?
Now, let's assume I have another function F2. If F2 can take F1 as an argument, F2 is a higher order function. Right? That's what the definition says.
Now, let's take a working example. You can ignore the syntax for now and just focus on the concept.
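The snippet is missing from this text. A minimal sketch, assuming doubler is an ordinary Scala function that takes an Int and returns twice its value, as the next paragraph describes (the value ten is only for illustration):

```scala
// Takes an integer and returns its double.
def doubler(i: Int): Int = i * 2

val ten = doubler(5)   // 10
```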

So, doubler is my function. It takes an integer as an input and returns a value by doubling it. It's a first-class function by default.
Now, let's see how we can apply those three things to this function.
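The first of the three things is assigning the function to a variable. A sketch, assuming the doubler function described above (the names d and six are only for illustration):

```scala
def doubler(i: Int): Int = i * 2

// 1. Assign the function to a variable. The explicit type Int => Int tells the
//    compiler to treat doubler as a function value here.
val d: Int => Int = doubler

val six = d(3)   // 6: calling through the variable works like calling doubler
```

In Scala 2 you will also see this written as `val d = doubler _`. The explicit function type used here works in both Scala 2 and Scala 3.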

The next one is to pass it to another function as an argument.
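A sketch of passing doubler as an argument. The name r and the use of map come from the explanation that follows; the range bounds 1 to 10 are an assumption.

```scala
def doubler(i: Int): Int = i * 2

val r = 1 to 10                 // a Range: 1, 2, ..., 10

// 2. Pass the function as an argument to map.
val doubled = r.map(doubler)    // Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
```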

So, r is a Scala collection of type Range. There are many methods defined on the Range collection. One of them is the map function. It takes a function as an argument and applies that function to every element of the range. So, if we pass the doubler function there, it should double each of the elements.

The output shows that every item is doubled. Since we assigned the doubler function to a variable, you can pass that variable as well.
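A sketch of the same call using the variable instead of the function name (d, r, and the range bounds are assumptions carried over from the description above):

```scala
def doubler(i: Int): Int = i * 2
val d: Int => Int = doubler

val r = 1 to 10
val viaVariable = r.map(d)   // same result as r.map(doubler)
```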

So, we applied the first two things to our function. And you must have realized that the map function is a higher order function because it took a function as an argument. Finally, we need an example of returning a function. But before we move further, let me quickly summarize.
All functions in Scala are first-class. So, you can do the following things with a function in Scala.

  1. Assign it to a variable.
  2. Pass it as an argument to other functions.
  3. Return it as a value from other functions.

A Function that does at least one of the following is a Higher Order Function.

  1. Takes one or more functions as arguments
  2. Returns a function as its result

Scala allows you to create higher order functions.
Good. We have seen everything in action except an example of a function returning another function. I will come to that, but let me cover one more term first.

What is an Anonymous Function?

A standard function has a name, a list of parameters, a return type, and a body. Right? If you don't give a name to a function, it's an anonymous function. Simple, isn't it? Let me show you one simple example.
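The snippet is missing from this text. A sketch of an anonymous function that matches the description that follows (parameter list, then body, then an optional result type); the doubling body is an assumption. It is meant to be evaluated as an expression, for example in the Scala REPL:

```scala
// parameter list  =>  body  : optional result type
(i: Int) => { i * 2 }: Int
```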

The above code shows the syntax for an anonymous function. The first part is the list of parameters, then the body, and then the return type. That's it. No name. The return type is optional. So, if you leave it out, Scala will automatically infer it.
If you compare the syntax of an anonymous Scala function with a standard Scala function, you may notice one main difference. A normal function uses the = symbol before the body, whereas an anonymous function uses the => symbol. That entire expression has another name, a function literal, and the syntax is known as function literal syntax.
Ok, so you might be wondering how to call this function if it doesn't have a name. Well, you can assign it to a variable.
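A sketch of assigning the function literal to a variable and calling it through that variable (the names d and eight are only for illustration):

```scala
val d = (i: Int) => { i * 2 }: Int   // d has type Int => Int

val eight = d(4)                      // 8
```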

But that's not the real purpose of an anonymous function. If we wanted to assign it to a variable and call it later, why would we create an anonymous function? What's wrong with a named function? I mean, a named function and an anonymous function assigned to a variable are almost the same thing. So, the big question is this.

What is the purpose of an anonymous function?

Why would you want to create an anonymous function? Well, the answer is simple.
There might be scenarios where you want to create an inline function for one-time use, and giving it a name doesn't make any sense because you don't want to use it anywhere else. In those cases, creating an anonymous function is quite convenient. Let me show you an example. I am going to create a function that returns a function. Pay a little attention to the syntax because the syntax for creating a function that returns another function is a little tricky.
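The snippet is missing from this text. The following is a sketch reconstructed from the walkthrough that follows: the names getOps, c, doubler and tripler come from it, while the exact layout and the inner parameter names are assumptions.

```scala
// No colon and no return type: Scala infers that getOps returns Int => Int.
def getOps(c: Int) = (i: Int) => {
  val doubler = (x: Int) => { x * 2 }   // local function
  val tripler = (x: Int) => { x * 3 }   // local function
  if (c > 0) doubler(i) else tripler(i) // positive c doubles, otherwise triples
}
```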

I am creating a function getOps. It takes a parameter c of type Int. In a standard function, you define the return type after a colon, then an = symbol, and finally the body. Right?
I am skipping the colon and the return type for simplicity. I can do that because Scala infers the return type automatically. So, I place an = symbol and then use the function literal syntax.
We used function literal syntax for creating an anonymous function, and I am using the same thing in the above code: a parameter list and a body separated by the => symbol. If you are still not clear about the syntax, just ignore it for now because we will be doing a lot of these things as we progress with the tutorial.
Inside the body, I create two local functions, doubler and tripler. If the value of c is positive, getOps returns the doubler; otherwise, it returns the tripler. Simple, isn't it? Let's test it.
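A sketch of such a test. The getOps definition repeats the reconstruction described above so that the snippet runs on its own; the argument -1 and the range 1 to 5 are assumptions, chosen so that the tripler is returned.

```scala
def getOps(c: Int) = (i: Int) => {
  val doubler = (x: Int) => { x * 2 }
  val tripler = (x: Int) => { x * 3 }
  if (c > 0) doubler(i) else tripler(i)
}

val r = 1 to 5
// getOps(-1) returns the tripling function, and map applies it to each element.
val tripled = r.map(getOps(-1))   // Vector(3, 6, 9, 12, 15)
```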

What do you think should happen in the above code? The getOps is a higher order function, and here it returns a tripler function. The map is another higher order function, which takes the tripler function and applies it to all elements of the range. It should triple every element of the range.
But the point that I was trying to explain was this.
Why would you want to create an anonymous function?
Let me explain that now. Just look at the code of the getOps function. I can write it like this.
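The rewritten version is missing from this text. One possible shape, as a sketch: the local doubler and tripler definitions are dropped and the work is written inline in a single anonymous function.

```scala
// Same behaviour as before, without the named local functions.
def getOps(c: Int) = (i: Int) => if (c > 0) i * 2 else i * 3
```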

Doesn't it look more straightforward and cleaner than the earlier version? Instead of defining those two functions and then returning them later, I simply write the anonymous function body right at the place where it is needed. This kind of code is more convenient to write and easier to understand. You may have several such scenarios where you just want to create a function and use it right there. Anonymous functions are there to allow you to do that.
So, now you understand first-class functions, higher order functions, anonymous functions and function literals. The function literal might be a little confusing. But you can simply say that a function written without a name, just its parameter list and body, is a function literal. That's it.

Why higher order functions?

Now you must be wondering about the benefits. What is the purpose of all this? I mean, we can pass functions and return them. But why would I want to do that? I am more than happy to use object-oriented techniques. What is so great about the idea of passing around functions?
Abstraction is the main benefit of higher order functions. Abstraction makes the application easily extendable. When developing at a higher level of abstraction, you communicate more of the behaviour and less of the implementation. Let me show you a simple example.
Let's say you have an array of customers.
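A sketch of such an array. The customer names are made up for illustration.

```scala
val customers = Array("Mike", "Zara", "Abdul", "Peter")
```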

Now you want to send a greeting message to all these clients. The most obvious method to do this is a loop. It may be something like this.
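A sketch of the loop, assuming the greeting is simply printed to the console (the names and the message text are made up):

```scala
val customers = Array("Mike", "Zara", "Abdul", "Peter")

// Imperative version: walk the array and call println for each element.
for (c <- customers) println("Hello " + c)
```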

It's quite simple code. For each customer, we want to call the function println. Instead of println, you may have some other function. For example, a payment reminder.
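A sketch of the same loop with a different function. Here remindPayment is a hypothetical stand-in that only prints a message; a real reminder would do more.

```scala
val customers = Array("Mike", "Zara", "Abdul", "Peter")

// Hypothetical stand-in for a real payment reminder.
def remindPayment(name: String): Unit = println("Payment reminder for " + name)

// Same loop as before, only the function being called has changed.
for (c <- customers) remindPayment(c)
```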

So, what are we doing? In fact, we want to say this.
For each customer, remind payment.
If you design this using a higher order function, your final code will look like this.
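A sketch of that single line. The last statement is the point; the definitions before it (the sample customers, a hypothetical remindPayment, and a compact forEach) are included only so that the snippet runs on its own.

```scala
val customers = Array("Mike", "Zara", "Abdul", "Peter")
def remindPayment(name: String): Unit = println("Payment reminder for " + name)
def forEach(a: Array[String], f: String => Unit): Unit = for (x <- a) f(x)

// For each customer, remind payment.
forEach(customers, remindPayment)
```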

This line does the same thing. But I have abstracted the implementation details behind a higher order function. The forEach is a higher order function, and it takes two parameters. I will show you the code in a minute, but let me make some points before we look at it.
This kind of programming approach is more concise, easier to understand and more reusable. I mean, I can reuse the same function to send payments to all my vendors.

Now, let's look at the code.
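The original code is missing from this text. A sketch of a forEach that matches the description that follows: the first parameter is an array of strings, the second is a function, and the body loops through the array calling the function for each element. The parameter names are assumptions.

```scala
// A higher order function: the second parameter f is itself a function.
def forEach(a: Array[String], f: String => Unit): Unit = {
  var i = 0
  while (i < a.length) {
    f(a(i))      // call the supplied function for the current element
    i += 1
  }
}
```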

The forEach function takes two parameters. The first parameter is an array of strings. The second parameter is a function. We loop through the array and call the function for each element. That's it.
In fact, Scala collections already provide this kind of iteration method. So, you don't need to implement a forEach function yourself. Let me show you.
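A sketch using the built-in foreach method that Scala collections, including Array, provide (the customers and remindPayment are the same hypothetical examples as before):

```scala
val customers = Array("Mike", "Zara", "Abdul", "Peter")
def remindPayment(name: String): Unit = println("Payment reminder for " + name)

// Built-in higher order method: no hand-written loop or forEach needed.
customers.foreach(remindPayment)
```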

It reads like this: take customers, and for each customer, remind payment. We will learn more about iterators later in the tutorial.
Great. So, we learned first-class functions, higher order functions, and their benefits. I will cover some more functional elements in the next lesson.
Thank you for visiting Learning Journal. Keep Learning and Keep growing.


